
Journal of Chinese Language and Computing 16 (2): 99-119

    Lexicon as a Generating System: Restating the Case of Complex Word Formation in Chinese

    Yuanjian He

    Department of Translation, Chinese University of Hong Kong, Shatin, N.T., Hong Kong

    Submitted on 12 March 2004; Revised and Accepted on 26 April 2006


    Part of the speaker’s grammatical competence is his/her ability to access and generate well-formed morpheme strings, which we call words. However, little attention has been paid to treating the Chinese lexicon as a generating system. Part of the difficulty was that, before Chomsky’s (1993/1995) framework of Generalized Transformation (GT), there was no unified computational system for all components of grammar, i.e. the lexicon, syntax, logical form and phonetic form. The current study therefore applies GT to Chinese complex word formation, from the perspective of the Chinese lexicon operating as a generating system. The new findings are: a) GT does work for Chinese morphology; b) the canonical word order in Chinese morphology is a mirror image of that of its syntax, e.g. (O)VS vs. SV(O) and OV vs. VO; c) Chinese has lexico-syntactic compounds whose syntactic component is a VP that is first generated in syntax and is then looped back to the lexicon to occur as a stem in compounds; and d) traditional Chinese “disyllabic compounds” of both syntactic and lexical origin have already evolved into root words, in all likelihood because the disyllabic phonological/prosodic pattern triggered the cognitive process in which rule-generated constructs are commuted to memory-stored root words.


    Keywords: lexicon, complex word formation, commutation between computation and memory

1. The Components of the Lexicon

Part of the speaker’s grammatical competence is his/her ability to access and generate well-formed morpheme strings, which we call words (a string ≥ one morpheme). To access is to draw on root words stored in the lexicon, and to generate is to construct complex words from roots and affixes, resulting in compounds, words with derivational or inflectional affixes, words with reduplicated roots, and even abbreviated word forms. Descriptively, the lexicon represents the speaker’s grammatical competence in producing well-formed words, and it consists, according to Kiparsky (1982) and Pinker (1999: 205), of three components, with relations to syntax as shown in (1):




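(1) [Diagram not reproduced: the three components of the lexicon (roots, complex word formation, regular inflection) and their channels to syntax.]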



    Every channel within the lexicon is eventually linked to syntax. First, roots may go directly to syntax. If not, they go to complex word formation or to regular inflection. Second, the output of complex word formation, i.e. complex words, will go either to syntax or to regular inflection. Third, the output of regular inflection, i.e. words with inflectional morphemes, will also go to syntax.
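The channels just described can be pictured as a small directed graph. The following is only a minimal sketch of that reading: the dictionary encoding and the names `CHANNELS` and `routes` are illustrative assumptions, not part of the model in Kiparsky (1982) or Pinker (1999).

```python
# Channels between components, as stated in the text.
# The dictionary encoding itself is an assumption made for illustration.
CHANNELS = {
    "roots": ["syntax", "complex word formation", "regular inflection"],
    "complex word formation": ["syntax", "regular inflection"],
    "regular inflection": ["syntax"],
    "syntax": [],
}


def routes(start="roots", goal="syntax"):
    """Enumerate every sequence of channels leading from start to goal."""
    if start == goal:
        return [[goal]]
    found = []
    for nxt in CHANNELS[start]:
        for tail in routes(nxt, goal):
            found.append([start] + tail)
    return found


for path in routes():
    print(" -> ".join(path))
```

On this encoding every route that starts from the roots ends in syntax, which is the point of the passage: no component of the lexicon is a dead end.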

    Pinker (1999) stipulates that roots are word forms that have been memorized, and to access them is to access the relevant part of the memory system. In contrast, complex words, including words with regular inflectional affixes, will be generated by rules, a process which is the computing act of the mind (ibid.).

    As a rule, monosyllabic words in Chinese are roots (e.g. Chao 1968). For the present study, I also take disyllabic words to be root-like and as no longer requiring computation for structure-building (see Section 5.0). Thus, I am concerned primarily with how multi-syllabic complex words are generated in Chinese.

    Section 2.0 introduces the system of Generalized Transformation (GT) (Chomsky 1993/1995), followed by discussion of major cases of Chinese complex word formation under GT in Section 3.0. Special cases of Chinese synthetic compounding are analyzed under the loop theory of Pinker (1999) in Section 4.0, and Section 5.0 presents arguments for what has been traditionally called “disyllabic compounds” being designated as root words instead. The conclusion is dealt with in Section 6.0.

2. Generalized Transformation as a Unified Computational System for Grammar

The issue of how to capture the speaker’s grammatical competence in producing well-formed words is reducible to how grammar generates lexical structures with a set of rules on a context-free and non-redundant basis. System-wise, computational rules, i.e. rules for constructing structures, ought to be the same for every component of grammar, namely, the lexicon, syntax, the phonetic form (PF), and the logical form (LF).

    The quest for a unified computational system in the past has led to studies adapting the X-bar rule schema for application in morphology. This is seen with Chinese linguists, e.g. Tang (1982, 1988, 1993, 1995; in part 1991a-b), Dai (1992, 1997), Sproat & Shih (1996) and Packard (2000), as well as with authors investigating other languages, e.g. Selkirk (1982), Scalise (1984), Sadock (1991) and Sadler & Arnold (1993).

    However, certain properties of the X-bar rule schema make it difficult to apply in morphology. One is its false representational generality, as seen in Chomsky (1970, 1981, 1986a-b) and in Jackendoff (1977). For instance, “$X^{n} \rightarrow X^{n-1}\ (YP)$ (order irrelevant)” has to entail a series of rules like $X^{n} \rightarrow X^{n-1}$, $X^{n} \rightarrow X^{n-1}\ YP$, and $X^{n} \rightarrow YP\ X^{n-1}$, much the same as the early phrase structure rules, e.g. those of Chomsky (1965). Such redundancy is less of a problem for syntax than for morphology, where a lexical structure has a limited capacity, more limited than that of an XP, and requires a generating capacity that is more precise and yet as context-free as in syntax.
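The redundancy at issue can be made concrete by spelling the schema out mechanically. The sketch below is illustrative only: the function name `expand_schema` and the string encoding of the symbols are assumptions, not notation from the works cited.

```python
def expand_schema(lhs, head, optional):
    """Spell out 'lhs -> head (optional)', order irrelevant, as explicit ordered rules."""
    return [
        (lhs, (head,)),            # optional constituent omitted
        (lhs, (head, optional)),   # head precedes the optional constituent
        (lhs, (optional, head)),   # head follows the optional constituent
    ]


for lhs, rhs in expand_schema("X^n", "X^(n-1)", "YP"):
    print(lhs, "->", " ".join(rhs))
```

One compact schema therefore stands for three ordered rewrite rules, which is the sense in which it behaves like a list of early phrase structure rules.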

    Another property of the X-bar rule schema is its uniquely defined hierarchy. Each level of an XP ($= X''$), for instance $[_{X'}\ X\ YP]$, $[_{X'}\ X'\ YP]$ or $[_{XP}\ X'\ YP]$ (order of constituents irrelevant), represents a unique structural relation between the constituents of that level and the head of the phrase, namely the relation of complement-head, adjunct-head, or specifier-head. Such a hierarchy, denoted by bar-notations, is not universally retainable in a lexical structure, again because of its more limited capacity compared with an XP.

    The tension between the want of a unified rule system for structure-building and the failure of the existing systems, such as the X-bar rule schema, has sometimes resulted in a total collapse of the rule schema. The best example is Packard’s (2000: 168) convoluted lexical rules for Chinese:

(2) a. $X^{-0} \rightarrow X^{-0/-1/\{W\}}\ X^{-0/-1/\{W\}}$
    b. $X^{-0} \rightarrow X^{-0}\ G$

    Instead of representing a morpheme (free or bound) of a syntactic category (N, V, A, P, etc.), X in (2) is associated with a primitive class, e.g. root, bound root, affix, and so on. In addition, there is a mixture of arbitrary bar-notations and primitive symbols, e.g. $X^{-0}$ = root, $X^{-1}$ = bound root, $X^{W}$ = affix, G = grammatical affix, {} = selected once only, and so on. Also, only $X^{-0}$ is allowed to recur, but not the others. As a result, the rules are almost idiosyncratically context-sensitive, and no longer have the context-free generating capacity required of computational rules, which by definition apply across the board to any item that calls on the rules to process it (Chomsky 1993/1995, Pinker 1999). Other defects include misconceived bar-notations and the lack of principles for describing the category of a lexical structure formed by these rules. But, as is well documented in a number of studies, if a lexical structure has a head, the head should determine the category of that structure (e.g. Williams 1981, Di Sciullo & Williams 1987, Katamba 1993).

    This tension did not go away until the system of Generalized Transformation (GT) was introduced in Chomsky (1993/1995). GT replaces the rule schema but retains the X-bar format. As a result, the aforementioned representational redundancies are dissolved and the relevant hierarchy is reduced to a level acceptable to morphology. Though it has mainly been tested on syntax and LF, and its application to the lexicon remains largely untested, GT was introduced in the spirit of serving as a unified computational tool for grammar. It operates in two phases: projection and merge (Chomsky 1995: 189-190). Projection is a concept as well as a technical operation. Once a lexical item X is drawn into computation (i.e. structure-building), it projects into a hierarchical constituent that immediately dominates it. Namely:

(3) $X \rightarrow [_{X}\ X]$

    “→” denotes projection, and it also denotes merge, movement, or deletion, depending on the structural and operational context. It is a general symbol for transformation, so to speak. It is assumed in Chomsky (1993/1995) that projection and merge occur in morphology and syntax, movement in syntax and LF, and deletion probably in PF.


    In syntax, X = N/A/V/P/etc. Further projections will conform to the relevant bar-format, e.g. $[_{X'}\ X]$ ($X = [_{X}\ X]$), $[_{X'}\ X']$ and $[_{XP}\ X']$ ($XP = X''$) (Chomsky 1995: 189).

    In morphology, X = a morpheme of category N/A/V/P/etc. A morpheme is either a root or an affix. Further projections will not conform to the bar-format, e.g. $[_{X}\ [_{X}\ X]]$.
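Projection in the morphological case can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Node` class and the helpers `category`, `project` and `show` are hypothetical names introduced here, and a bare string is taken to stand for a lexical item of the category it names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str        # category of the constituent
    children: tuple   # bare items (str) or further Node objects


def category(structure):
    """A bare item stands for its own category; a Node carries a label."""
    return structure if isinstance(structure, str) else structure.label


def project(structure):
    """Build a constituent of the same category that immediately dominates `structure`."""
    return Node(category(structure), (structure,))


def show(structure):
    """Render labelled bracketing, with the label written right after '['."""
    if isinstance(structure, str):
        return structure
    inner = " ".join(show(child) for child in structure.children)
    return f"[{structure.label} {inner}]"


once = project("X")     # the step in (3)
twice = project(once)   # a further projection without bar levels
print(show(once))
print(show(twice))
```

Because the label is simply copied upward, repeated projection never introduces a bar level, which is the property that distinguishes the morphological case from the syntactic one.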

    Now consider merge, which subsumes two operations: inserting a primitive position in a targeted structure, and substituting the primitive position with another structure. Suppose that $[_{X}\ X]$ is targeted for merge. It will further project into a structure containing $[_{X}\ X]$, and then a primitive position “0” is inserted and immediately substituted by another structure Y:

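(4) a. $[_{X}\ X] \rightarrow [_{X}\ [_{X}\ X]]$
    b. $[_{X}\ [_{X}\ X]] \rightarrow [_{X}\ [_{X}\ X]\ 0]$ or $[_{X}\ 0\ [_{X}\ X]]$
    c. $[_{X}\ [_{X}\ X]\ 0] \rightarrow [_{X}\ [_{X}\ X]\ Y]$
    d. $[_{X}\ 0\ [_{X}\ X]] \rightarrow [_{X}\ Y\ [_{X}\ X]]$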

    The head-parameter decides to which side of X, i.e. left or right, Y is merged. In other words, language-specific conditions dictate whether the merged structure is head-initial, like (4c), or head-final, like (4d).

    Note that because it is $[_{X}\ X]$, not Y, that is targeted for merge, X, not Y, is the head of the merged structure. In effect, by appropriate targeting, the merge operation automatically determines the category of a branched structure, making it either left- or right-hand headed.
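The steps in (4) and the way targeting fixes the head can be sketched as follows. This is an illustration only, not an implementation from Chomsky (1993/1995): the `Node` class, the `merge` function and its boolean `head_initial` argument (standing in for the head parameter) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str        # category of the constituent
    children: tuple   # bare items (str) or further Node objects


EMPTY = "0"  # the primitive position inserted in step (4b)


def show(structure):
    """Render labelled bracketing, with the label written right after '['."""
    if isinstance(structure, str):
        return structure
    inner = " ".join(show(child) for child in structure.children)
    return f"[{structure.label} {inner}]"


def merge(target, other, head_initial):
    """Merge `other` into a projection of `target`; `target` supplies the label."""
    # (4a) the targeted structure projects once more
    projected = Node(target.label, (target,))
    # (4b) insert the primitive position on the side chosen by the head parameter
    slots = (target, EMPTY) if head_initial else (EMPTY, target)
    with_gap = Node(projected.label, slots)
    # (4c)/(4d) substitute the primitive position with the other structure
    filled = tuple(other if child is EMPTY else child for child in with_gap.children)
    return Node(with_gap.label, filled)


x = Node("X", ("X",))
y = Node("Y", ("Y",))
print(show(merge(x, y, head_initial=True)))    # head-initial, as in (4c)
print(show(merge(x, y, head_initial=False)))   # head-final, as in (4d)
```

In both outputs the outer label is X, whichever side Y lands on: the category comes from the targeted structure, and the head parameter affects linear order only.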

    The process in (4) applies either in syntax or in the lexicon. In syntax, it is appropriately applicable to constructs like Verb-Aspect clusters in Chinese, where aspect markers are merged with a verb, such as $[_{V}\ V\ Asp]$, which then continues to project till it forms a VP.

    In the lexicon, the process simply generates lexical structures. Given that Y is a structure itself, i.e. $[_{Y}\ Y]$, $[_{Y}\ [_{Y}\ Y]\ Z]$ or $[_{Y}\ Z\ [_{Y}\ Y]]$, in which Z is also a structure, the merged structures in (4c-d) would therefore represent the following:

     c. X d.