Current models of spoken word recognition are based predominantly on studies of Indo-European languages. As a result, less is known about the recognition processes involved in perceiving tonal languages (e.g., Mandarin Chinese) and about the role of lexical tone in speech perception. One view is that words in tonal languages are processed phonologically through individual segments, while another is that they are processed lexically as a whole. Moreover, a recent study claimed to be the first to discover an early phonological processing stage in Mandarin (Huang et al., 2014). Tonal languages remain under-investigated: no clear conclusions have been reached about the nature of tonal processing or about which model of spoken word recognition best incorporates lexical tone. The current study addressed these issues by presenting 18 native Mandarin speakers with aural sentences containing medial target words. Each aural sentence was preceded by a visually presented sentence with a medial target word (e.g., 家 /jia1/ “home”), which the aural target word either matched or mismatched. Mismatching target words fell into three violation conditions: tone violation, where only the tone differed (e.g., 价 /jia4/ “price”); onset violation, where only the onset differed (e.g., 虾 /xia1/ “shrimp”); and syllable violation, where both the tone and the onset differed (e.g., 糖 /tang2/ “candy”). We did not find evidence for an early phonological processing stage in Mandarin. Instead, our findings indicate that Mandarin syllables are processed incrementally through phonological segments and that tone is strongly associated with lexical access. These results are discussed with respect to how existing models of spoken word recognition could be modified to incorporate the processes involved in tonal language recognition.