Paris 2009


     Lexical Categories in Persian

     Gh. Karimi-Doostan, University of Kurdistan

     ICIL3, Universite Sorbonne Nouvelle, Paris, September 11-13, 2009

     Email address:

    1. Introduction

     Salish, Mundari, Munda, Kharia and Manipuri lexical categories - noun, verb, adjective/adverb and preposition (Bhat, 2000; Croft, 2005a; Hengeveld and Rijkhoff 2005; Wiltschko, 2005) vs.

    Persian lexical categories such as adverbs, adjectives, prepositions, nouns and verbs, which can be distinguished from each other by their morphosyntactic behavior and distributional criteria.

    2. Persian Word Classes

    2.1. Common Word Classes

     Persian traditional grammarians, mainly influenced by works on Arabic, English and French, have taken word classes in this language into consideration. Taleghani (1926) and Najmghani (1940) recognize three categories: nouns, verbs and articles. Khayampour (1965) holds that the Persian parts of speech are nouns, verbs, adjectives, adverbs, minor sentences and adjuncts. In Khanlari (1976) the language has seven parts of speech including nouns, verbs, adjectives, pronouns, adverbs, articles, prepositions and interjections. Gharib et al. (1984) add the category of numbers to Khanlari's word classes. According to Anvari and Givi (1996), there are seven word classes in Persian: nouns, verbs, adjectives, pronouns, adverbs, articles, prepositions and minor sentences. Moshkatodini (2005) recognizes five common lexical categories in the language: nouns, verbs, adjectives, adverbs and prepositions. He shows that these categories can be differentiated from each other by morphosyntactic and distributional tests.

     Although Croft (2005b: 435), in line with Schachter (1985: 5-6), Croft (1991: 45-46), Croft (2001: 75-83) and Evans and Osada (2005), criticizes the distributional method of lexical categorization, he states that it is the fundamental empirical method of grammatical analysis and that there is no other.

    2.1.1. Adjectives / Adverbs

     (1a) Simple form            Comparative form    Superlative form

          tamiz 'clean'          tamiz-tar           tamiz-tarin

          xarâb 'ruined'         xarâb-tar           xarâb-tarin

     (1b) ?otâq-e tamiz 'the clean room'

     (1c) xeili tamiz 'very clean'

     (1d) Simple form            Comparative form    Superlative form

          pâ?in 'down, under'    pâ?in-tar           pâ?in-tarin

          bâlâ 'above, up'       bâlâ-tar            bâlâ-tarin

     (1e) ?otâq-e bâlâ 'the upper room'

     (1f) xeili bâlâ

     (2a) xošbaxtâne Ali sar vaqt rasid.

     fortunately Ali on time arrived

     'Fortunately, Ali arrived on time.'

     (2b) Ali tond midavad.

     Ali quickly runs

     'Ali runs quickly.'

    2.1.2. Prepositions

     (3a) ?az madrese-ye man 'from my school'

     (3b) dar xâne-ye bozorg 'in the big house'

2.1.3. Nouns

     (4a) da'vat-hâ-ye ziyâdi dâšt-im.

     invitation-pl-Ezafe a lot had-1st Pl.

     'We had a lot of invitations.'

     (4b) da'vat-e xoob bood.

     invitation-definite good was

     'The invitation was good.'

     (4c) ?in da'vat barâ-ye če bood?

     this invitation for what was

     'What was this invitation for?'

     (4d) ?az da'vat porsid.

     from invitation asked

     'He/she asked about the invitation.'

     (4e) da'vat-e ali az man

     invitation-Ezafe Ali from me

     'Ali's invitation of me'

     (4f) ?in da'vat kâr-o xarâb kard.

     this invitation work-SOM ruining did

     'This invitation ruined the work.'

     (4'a) ?âmadan o raftan-hâ-ye ziyâdi dâšt-im.

     coming and going-pl-Ezafe a lot had-1st Pl.

     'We had a lot of comings and goings.'

     (4'b) davidan-e xoob bood.

     running-definite good was

     'The running was good.'

     (4'c) ?in davidan barâ-ye če bood?

     this running for what was

     'What was this running for?'

     (4'd) ?az davidan porsid.

     from running asked

     'He/she asked about running.'

     (4'e) davidan-e ali

     running-Ezafe Ali

     'Ali's running'

     (4'f) ?in davidan kâr-o xarâb kard.

     this running work-SOM ruining did

     'This running ruined the work.'

    2.1.4. Verbs

     (5) man nâme-ro nevešt-am

     I letter-SOM wrote-1st sing

     'I wrote the letter.'

2.2. Mismatching Words


    Table 1

     Words                     Mo'aser Dictionary   Mo'in Dictionary       Sokhan Dictionary
     hefz 'memorizing'         Noun                 Transitive Masdar      Masdar Noun
     farâham 'providing'       Noun                 Subject Noun           Masdar Noun
     ?anjâm 'performing'       Noun                 Noun                   Masdar Noun
     ?edâme 'continuation'     Noun                 Transitive Masdar      Masdar Noun
     taqlil 'reduction'        Noun                 Transitive Masdar      Masdar Noun
     hedâyat 'advising'        Noun                 Transitive Masdar      Masdar Noun
     tahye 'providing'         Noun                 Transitive Masdar      Masdar Noun
     basij 'mobilizing'        Noun                 Noun                   Masdar Noun
     tarahom 'pity'            Noun                 Intransitive Masdar    Noun
     Parvareš 'fostering'      Noun                 Masdar Noun            Noun

     Table 2

     Words                     Mo'aser Dictionary   Mo'in Dictionary       Sokhan Dictionary
     farâmuš 'forgotten'       adjective            noun                   adjective
     mahsub 'considered'       adjective            subject noun           adjective
     moraxas 'release'         adjective            object noun            adjective
     vâdâr 'persuading'        adjective            noun objective         adjective
     gom 'losing'              adjective            adjective              adjective
     mahsur 'surrounded'       adjective            object noun            adjective
     maslub 'crucified'        adjective            object noun            adjective
     vâžgun 'overturned'       adjective            adjective              adjective
     fâyeq 'overcome'          adjective            subject noun           adjective

2.2.1. Predicative Nouns

     (6a) * ?anjâm-hâ-ye ziyâdi dâštim.

     performing-pl-Ezafe a lot had-1st Pl.

     'We had a lot of performances.'

     (6b) * ?anjâm-e xoob bood.

     performing-Def. Art. good was

     'The performing was good.'

     (6c) * ?in ?anjâm barâ-ye če bood?

     this performing for what was

     'What was this performing for?'

     (6d) * ?az ?anjâm porsid.

     from performing asked

     'He/she asked about the performing.'

     (6e) * ?in ?anjâm kâr-o xarâb kard.

     this performing work-SOM ruining did

     'This performing ruined the work.'

     (7a) ?anjâm-e kâr

     performing-Ezafe work

     'The performing of the work'

     (7b) ?az ?anjâm-e kâr porsid.

     from performing-Ezafe work asked

     'He/she asked if the work had been done.'


2.2.2. Predicative Adjectives

     (8a) Simple forms              Comparative forms    Superlative forms

          farâmuš 'forgetting'      * farâmuš-tar        * farâmuš-tarin

          mahsub 'considering'      * mahsub-tar         * mahsub-tarin

     (8b) * ketâb-e farâmuš / gom 'the forgotten/lost book'

     (8c) * xeili farâmuš / gom 'very forgotten/lost'
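
     The contrast between (1a-c) and (8a-c) can be restated as a small table of distributional tests. The Python sketch below is only an illustration: the test labels and the function name `behaves_like_common_adjective` are hypothetical names introduced here, and the judgments are copied from examples (1a-c) for tamiz and (8a-c) for farâmuš.

```python
# Grammaticality judgments copied from examples (1a-c) and (8a-c).
# True = the form is grammatical; False = the form is starred.
# The test labels below are invented for this sketch.
ADJECTIVE_TESTS = ("comparative_tar", "attributive_ezafe", "intensifier_xeili")

JUDGMENTS = {
    "tamiz": {
        "comparative_tar": True,     # tamiz-tar (1a)
        "attributive_ezafe": True,   # ?otâq-e tamiz (1b)
        "intensifier_xeili": True,   # xeili tamiz (1c)
    },
    "farâmuš": {
        "comparative_tar": False,    # * farâmuš-tar (8a)
        "attributive_ezafe": False,  # * ketâb-e farâmuš (8b)
        "intensifier_xeili": False,  # * xeili farâmuš (8c)
    },
}


def behaves_like_common_adjective(word):
    """Return True only if the word passes every adjective test above."""
    return all(JUDGMENTS[word][test] for test in ADJECTIVE_TESTS)


if __name__ == "__main__":
    for word in JUDGMENTS:
        print(word, behaves_like_common_adjective(word))
```

     On these data a common adjective passes all three tests, whereas a predicative adjective fails each of them; words not listed in (1) or (8) are deliberately left out rather than guessed.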

    3. Mismatching words as independent words

     (9) Ali farâmuš ne-mi-konad.

     Ali forgetting Neg. Impf-DO

     'Ali does not forget it.'

     (10a) Ali farâmuš dârad mi-konad.

     Ali forgetting Prog.Aux. Impf-DO

     'Ali is forgetting it.'

     (10b) Ali farâmuš xâhad kard.

     Ali forgetting Fu.Aux. DO

     'Ali will forget it.'

     (10c) farâmuš-aš na-bâyad mi-kardi.

     forgetting-it neg-should Impf.-DO-Past did

     'You shouldn't have forgotten it.'

    In addition to the evidence in (9) and (10), almost all native speakers of Persian regard mismatching words as independent words of their language, just as they regard the different lexical categories in (1-4) as words of their language.

    Karimi-Doostan (1997, 2000, 2005) shows that mismatching words are predicative, argument-structure-bearing elements. They also have lexical aspectual properties: the data in (11-12) indicate that mismatching words carry aspectual (telic/atelic) information, because some can co-occur only with 'in' time expressions (cf. 11a with 11b) and some only with 'for' time expressions (cf. 12a with 12b).

    (11a) ?anjâm-e kâr tavasot-e Ali dar panj daqiqe

     performing-Ezafe work by-Ezafe Ali in five minutes

     “The performance of the work by Ali in five minutes”

     (11b) * ?anjâm-e kâr tavasot-e Ali barâye panj daqiqe

     performing-Ezafe work by-Ezafe Ali for five minutes

     “The performance of the work by Ali for five minutes”

     (12a) ?edâme-ye kâr tavasot-e Ali barâye panj daqiqe

     continuation-Ezafe work by-Ezafe Ali for five minutes

     “The continuation of the work by Ali for five minutes.”

     (12b) * ?edâme-ye kâr tavasot-e Ali dar panj daqiqe

     continuation-Ezafe work by-Ezafe Ali in five minutes

     “The continuation of the work by Ali in five minutes”
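
     The diagnostic in (11-12) is the familiar 'in'/'for' time-adverbial test. As a minimal Python sketch, assuming only the four judgments given above (the names `COOCCURRENCE` and `aspect_class` are hypothetical and introduced here for illustration):

```python
# Judgments copied from (11a-b) and (12a-b); True = grammatical.
# "in_phrase" stands for 'dar panj daqiqe' (in five minutes),
# "for_phrase" for 'barâye panj daqiqe' (for five minutes).
COOCCURRENCE = {
    "?anjâm": {"in_phrase": True, "for_phrase": False},   # (11a) vs. *(11b)
    "?edâme": {"in_phrase": False, "for_phrase": True},   # *(12b) vs. (12a)
}


def aspect_class(word):
    """Classify a word as telic or atelic from its 'in'/'for' judgments."""
    judgment = COOCCURRENCE[word]
    if judgment["in_phrase"] and not judgment["for_phrase"]:
        return "telic"
    if judgment["for_phrase"] and not judgment["in_phrase"]:
        return "atelic"
    return "undetermined"


if __name__ == "__main__":
    for word in COOCCURRENCE:
        print(word, aspect_class(word))
```

     The "undetermined" branch is kept so that a word accepting both or neither adverbial is not forced into either class.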

     (13) Ali kâr-râ ?anjâm dâd.

     Ali work-SOM performing gave

     “Ali performed the work.”

     (14) ?anjâm-e kâr tavasot-e Ali

     performing-Ezafe work by-Ezafe Ali

     “performing the work by Ali”


     (15) Ali ketâb-râ farâmuš kard.

     Ali book-SOM forgetting did

     “Ali forgot the book.”

     (16) * farâmuš-e ketâb tavasot-e Ali

     forgetting-Ezafe book by-Ezafe Ali

     “forgetting the book by Ali”

    The separability and non-separability of Persian Light Verb Constructions under syntactic movements such as preposing, topicalization or scrambling have puzzled researchers (Karimi-Doostan, 1997: 192-205; Megerdoomian, 2002; among others) for many years.

     (17) a. Ali ka:r-aš-ra: ?anjâm da:d.

     Ali work-his-DOM performing gave

     'Ali did his work.'

     *b. Ali ?anjâm-e xub-i (?az) (be) ka:r-aš da:d.

     Ali performing-Ez. good-Indef.Art. (of) (to) work-his gave

     'Ali had a good performance in his work. / Ali did his work well.'

     *c. Ali ?in ?anjâm-ra: (?az) (be) ka:r-aš da:d.

     Ali this performing-DOM (of) (to) work-his gave

     'Ali did his work.'

     *d. ?anjâm-i ke Ali (?az) (be) ka:r-aš da:d mofid bud (VN)

     performing-Indef.Art. that Ali (of) (to) work-his GIVE-Past useful was

     'Ali's performing his work was useful.'

     *e. Ali če ?anjâm-i (?az) (be) ka:r-aš da:d? (VN)

     Ali what performance-Indef.Art. (of) (to) work-his GIVE-Past

     'What sort of performance did Ali have in his work?'

     *f. ?in ?anjâm-ra: Ali (?az) (be) ka:r-aš da:d. (VN)

     this performing-DOM Ali (of) (to) work-his GIVE-Past

     'Ali did his work.'

     (18) a. tegarg be ba:q-e-man latme zad.

     hail to garden-Ez-I damage beat-Past

     'The hail damaged my garden.'

     b. tegarg latme-ye bad-i be ba:q-e-man zad.

     hail damage-Ez. bad-Def.Par. to garden-Ez.-I beat

     'The hail caused bad damage to my garden. / The hail damaged my garden.'

     c. tegarg-e diruz ?in latme-ra: be ba:q-e-man zad.

     hail-Ez. yesterday this damage-DOM to garden-Ez.-I beat

     'Yesterday's hail caused this damage to my garden.'

     d. latme-?i ke tegarg be ba:q-ha: zad jobra:n na:pazir ?ast.

     damage-Indef.Par. that hail to gardens beat irretrievable is

     'The damage caused by the hail to the gardens is irretrievable.'

     e. Ali če latme-?i be šoma: zad?

     Ali what damage-Indef. Art. to you beat

     'What loss did Ali cause to you?'

     f. ?in latme-ra: tegarg-e diruz be ba:q-e-man zad.

     this damage-DOM hail-Ez. yesterday to garden-Ez.-I beat

     'Yesterday's hail caused this damage to my garden.'

    4. Approaches to Lexical Categories

4.1. Lexical and Feature-based Approaches


     As mentioned in Baker (1988), Chomsky (1970) and Fodor (1970) initiated the development of the so-called 'Strong Lexicalist

    In the traditional, so-called notional or semantic, analysis of parts of speech, the main lexical categories are defined as follows:

    (19) a. Nouns denote objects (persons, things, places)

    b. Adjectives denote properties

    c. Verbs denote actions

    Langacker (1987) also proposes a conceptual analysis of parts of speech in which a noun is conceptually a thing, a verb is a process, and an adjective is a concept construed as relational.

     (20) a. noun [+N, -V, -F] b. verb [-N, +V, -F]

     c. adjective [+N, +V, -F] d. preposition [-N, -V, -F]
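
     The matrix in (20) can be checked mechanically. The following is a minimal Python sketch rather than part of the original analysis: the names `FEATURES` and `distinguishing_features` are hypothetical, and the feature values are copied from (20) exactly as printed, including [-F] on all four categories.

```python
# Feature values copied from (20); True stands for '+', False for '-'.
FEATURES = {
    "noun":        {"N": True,  "V": False, "F": False},
    "verb":        {"N": False, "V": True,  "F": False},
    "adjective":   {"N": True,  "V": True,  "F": False},
    "preposition": {"N": False, "V": False, "F": False},
}


def distinguishing_features(cat_a, cat_b):
    """List the features on which two categories in (20) disagree."""
    return [
        name
        for name in ("N", "V", "F")
        if FEATURES[cat_a][name] != FEATURES[cat_b][name]
    ]


if __name__ == "__main__":
    print(distinguishing_features("noun", "verb"))
    print(distinguishing_features("noun", "adjective"))
```

     In this encoding the four lexical categories exhaust the combinations of [N] and [V], and [F] never separates any two of them, since all four are [-F].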

    Some other researchers have found it necessary to extend Chomsky's binary feature system and have attempted to apply it to functional categories too. For example, Stowell (1981: 40) argues that the difference between NP and CP should be construed in terms of a tense operator. This distinction is formalized in terms of a [±Tense] feature: NPs are [-Tense] and CPs are [+Tense]. In Fukui (1986), in addition to Chomsky's (1970) features, a [±Kase] feature is also used: each functional head is specified as [+Kase] or [-Kase], and the Kase feature is applicable to lexical categories as well. Transitive and unergative verbs are [+Kase], unaccusatives are [-Kase], prepositions are [+Kase], and nouns and adjectives are [-Kase].

    As noted in Vinokurova (2005), Hale and Platero (1986) attempt to use lexical categorial features in a hierarchical system. They believe that there is a universal inventory of categorial features and that languages differ as to which features they choose to utilize and which to ignore. Navajo, for instance, does not use the [A] feature because adjectival meanings are expressed by verbs. In English, by contrast, adjectives form a significant class of words, hence the [A] feature must be given a place in the hierarchy. The place a particular categorial feature is given in the hierarchy is determined on language-specific grounds.

    Dechaine (1993) proposes that there is an asymmetry between N, V, A and P. In her account N and V are universal and associated with extended projections (the potential set of functional heads dominating a lexical head), but A and P are relatively marginal and may not have full extended projections of functional heads. She assumes that this asymmetry is not captured by the [±nominal, ±verbal] feature system and presents a feature system based on [±functional, ±nominal, ±referential]. The [+referential] categories are [+nominal] N, D and [-nominal] V, T. N and V are categorially selected by the functional heads D and T, respectively.

4.2. Syntactic and Constructional-based Approaches

    Marantz (1997) claims that Chomsky (1970) has been misinterpreted as an approach in which word formation takes place in the lexical component. He argues that word formation takes place in the syntax, a line of thought pursued in the theory named 'Distributed Morphology' (DM), which was officially launched in a seminal paper by Halle and Marantz (1993) and subsequently developed and explicated in more detail in Halle and Marantz (1994), Marantz (1997), Harley and Noyer (1998) and many others. For DM, open-class lexical morphemes are ROOTS in a local relation with the category-defining F-MORPHEMES (functional morphemes) v, n and a, read as 'little v', 'little n' and 'little a' respectively (Marantz, 1997; Harley and Noyer, 1998; Harley and Noyer, 2001; Arad, 2003; Wiltschko, 2005). Marantz (2000, 2001) argues that lexical categories are syntactically derived by merging category-neutral roots with the category-defining functional heads v, n and a. A well-worn example is the root √grow, which is a 'verb' in a local relation with the category-defining head v:


    (21)      v            grow-ø
             / \
            v   √grow

    By contrast, √grow in a local relation with n is a noun, and in a local relation with a it is an adjective.
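
    The idea that a category-neutral root receives its category from the head it merges with can be pictured with a very small Python sketch. This is an illustration only, not an implementation of DM: the function `categorize` and the dictionary `CATEGORY_OF_HEAD` are hypothetical names, and the only facts encoded are those stated above for the root 'grow' and the heads v, n and a.

```python
# Category-defining heads and the category they assign, as described in
# the text (little v, little n, little a).
CATEGORY_OF_HEAD = {"v": "verb", "n": "noun", "a": "adjective"}


def categorize(root, head):
    """Merge a category-neutral root with a category-defining head.

    The root itself carries no category; the result's category comes
    entirely from the head.
    """
    if head not in CATEGORY_OF_HEAD:
        raise ValueError(f"not a category-defining head: {head!r}")
    return {"root": root, "head": head, "category": CATEGORY_OF_HEAD[head]}


if __name__ == "__main__":
    for head in ("v", "n", "a"):
        print(categorize("grow", head))
```

    The same root string appears in all three results; only the head differs, which is the point of example (21).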

    Recent DM work under the Single Engine Hypothesis (Marantz, 2001, 2002; Arad, 2003; Volpe, 2007) claims that the formation of lexical categories is a syntactic operation and that derivational morphology is wholly the product of syntax.

    Croft (2005b: 436), taking the term 'construction' broadly, as used in contemporary models of construction grammar (Fillmore, Kay & O'Connor, 1988; Goldberg, 1995; Langacker, 1987; Croft & Cruse, 2004), and considering 'distributional analysis as identifying the mapping that is found in the empirical data between constructions and the elements (words, roots, phrases)', states:

     … [T]here is a complex, many to many mapping between constructions and elements. Put in this way, it should not be terribly surprising or even radical to a linguist. What is not appreciated is that large, mutually exclusive word classes are incompatible with this view. …. The reason that this conclusion is problematic for many linguists is that it means that one cannot take a 'building block' approach to analyzing complex syntactic structures. One cannot define complex structures as being built ultimately out of a set of atomic primitive categories like 'noun', 'verb' and 'adjective'. In fact, we cannot treat 'noun', 'verb' and 'adjective' as crosslinguistically universal categories of particular languages.

    Unlike generative linguists, who consider lexical categories to be primitive atoms from which grammatical structures are built up (Haegeman, 1994; O'Grady, 1997: 164; among others), (Radical) Construction Grammar (Croft, 2000, 2001, 2005b) takes constructions, not categories and relations, to be the basic primitive units of syntactic representation. Categories are derived from constructions and are construction-specific: the categories of a particular language are defined by its constructions.

5. Discussion and Descriptive and Theoretical Implications

     Mismatching words (predicative nouns and adjectives) are lexemes, with the properties referred to in sections {2.2} and {3}, whose lexical category is not fully specified. They raise questions for certain linguistic theories and may challenge and undermine their positions concerning lexical categories and parts of speech. The fact that mismatching words denote actions or properties but cannot be considered verbs or adjectives, as shown in (6) and (8), indicates that the lexical and feature-based approaches to parts of speech {4.1} cannot be adequate for classifying Persian word classes. The binary feature approaches as in (20), and the other similar positions presented in {4.1}, cannot differentiate mismatching words from the other lexical categories in Persian. This is clearly demonstrated in section {2.2}, which illustrates that mismatching words do not behave morphosyntactically like the common word classes of section {2.1}, so that criteria of this kind cannot adequately account for Persian lexical word classes.

    Chametzky (2003), quoting from Lebeaux (1988), writes that there are separate representations comprising open class (lexical categories) and closed class (functional categories) items respectively. Theta relations are represented by the open class objects and case relations are represented by the closed class objects; that is, open class items license semantically and closed class items create frames into which the theta representation is projected. Bhat (2000: 47) states that:

    "languages differ from one another in the strategies that they use for sentence structures, and because of this, they also differ from one another in the number and type of sentential functions that they need to be expressed; the variations we find in the number and type of word classes that these languages possess can then be regarded as a reflection of this variation in the occurrence of sentential functions."


    Lebeaux's (1988) and Bhat's (2000) views are in the same spirit as the DM and constructionist approaches presented in {4.2}. It seems that these approaches, in comparison to lexical and generative feature-based approaches, are in general more reasonable and adequate in explaining lexical categories.

     Although the existence of both common word classes and mismatching words in Persian may cast doubt on the DM approach to lexical categories, one can argue that mismatching words, lacking lexical categorial properties, can be taken as lexemes or roots without a lexical category, which can be specified for their lexical category in the syntax (Marantz, 1997; Harley and Noyer, 1998; Harley and Noyer, 2001; Arad, 2003; Volpe, 2007). Moreover, the fact that mismatching words cannot play syntactic roles without being accompanied by verbal or nominal elements (13-16) and (22-23) may indicate that mismatching words in Persian look like category-neutral roots in the DM theory. Interestingly, mismatching words are not able to function as verbs unless they are accompanied by light verbs. Predicative adjectives, which are not able to appear in Ezafe Constructions (16), may be used in such constructions when they are nominalized and the nominal morpheme -i is suffixed to them (22-23). Therefore, one may claim that mismatching words are similar to roots, and that light verbs and -i are similar to little v and little n respectively in DM.

     (22) kansel-i-ye barnaame

     cancel-Nom-Ez program

     'Cancellation of the program'

     (23) farâmuš-i-ye matâleb tavasot-e Ali

     forgetting-Nom-Ez matters by Ali

     'forgetting the matters by Ali'

    Anward (2000: 4) claims that speakers do not set out to acquire a parts-of-speech system. Parts of speech are acquired as language users engage in processes of successive syntagmatic and paradigmatic expansions. In the same spirit, Van Kampen (2005) develops an approach to lexical categories which is generally compatible with the constructionist views referred to in section {4.2}. Van Kampen (2005) argues that the universal categories N/V are not applied to content words before the grammatical markings for reference, D(eterminers), and predication, I(nflection), have been acquired. According to her account, language acquisition takes place in two successive stages, which she refers to as 'proto-grammar' and 'real grammar'. In the former, children acquire language-specific operators and category-neutral content words, X?s, and functional categories such as I and D and lexical categories do not exist. In the real grammar, acquired in a stage after the proto-grammar, lexical categories such as N, V and A come into existence as a result of the frequent usage of the category-neutral content words of the proto-grammar in syntactic structures and constructions. The category-neutral X?s are categorized as lexical categories such as V and N as a result of their usage as the complements of functional categories such as I and D. In Van Kampen (2005), language-specific systems are not acquired due to common UG entrance. Rather, they are highly frequent language-specific bootstraps that coax the child into an adult system that eventually fits UG principles.

    This way of accounting for lexical categories can easily capture the existence of both common word classes and mismatching words in Persian. We assume that the category-neutral content words of the proto-grammar in Persian have the chance to appear either as common word classes, as in section {2.1}, or as mismatching words, as in section {2.2}. This means that Persian real grammar, in the sense used in Van Kampen (2005), provides the category-neutral content words with two possibilities: to appear as common word classes, or to appear as mismatching words in Light Verb Constructions or Ezafe Constructions in the way explained above. In fact, we claim that the reason for the existence of mismatching words is the existence of the peculiar, well studied and still controversial Ezafe Constructions (Ghomeshi, 1997; Kahnemuyipour, 2000; Larson and Yamakido, 2005; Karimi, 2007; among others) and Light Verb Constructions (Vahedi-langroudi, 1996; Megerdoomian, 2000, 2001; Folli, Harley and Karimi, 2005; Karimi-Doostan, 2005; among others). The existence of these two constructions provides a special situation in which some argument-bearing content words can function as components of these constructions without being fully specified for their lexical category. This supports Van Kampen's (2005) position: mismatching words remain unspecified with respect to their lexical category because, unlike the common word classes presented in section {2.1}, they never have the chance to be the complements of functional categories such as I and D. In short, the type of construction in which content words appear in the syntax determines whether they are categorized as common word classes or as mismatching words in Persian.

    Now we return to the question of whether mismatching words in Persian may cast doubt on the universality of lexical categories and on their status as primitive atoms from which syntactic structures are derived. In Radical Construction Grammar 'noun', 'verb' and 'adjective' are not treated as crosslinguistically universal categories of particular languages or as primitive units from which syntactic constructions are built. We have also assumed a constructionist approach in which mismatching words, as independent lexical and content words lacking lexical categories, have come into existence as a result of the existence of constructions. This can mean that constructions are the sources from which words are derived, and the atomicity of lexical categories as primitive units in language acquisition may be called into question. However, lexical and content words in Persian, and the way that they fall into common word classes and mismatching words, do not, contra Croft (2005b), firmly support the view that lexical categories do not exist crosslinguistically. In fact, mismatching words remain unspecified with respect to their lexical categories, and are not categorized as N, V or A, because they have the chance to discharge their argument structures as components of Light Verb Constructions and/or Ezafe Constructions, which are headed by light verbs and the Ezafe particle respectively, and these heads have been claimed to have verbal and nominal features. Karimi-Doostan (1997, 2000, 2005) claims that light verbs possess verbal properties, and Karimi (2007) shows that the Ezafe particle, as an 'n', has a function in DPs similar to the function of v in VPs. Larson and Cheung (2007), following Samiian (1983, 1994), write that Ezafe can be understood as a case phenomenon and integrated with a semantic picture of DP in which nominal modifiers are the complements of D.

    Finally, in cognitive linguistics lexical categories are acquired on the basis of cognitive distinctions (thing/event); these distinctions lead to lexical categories such as N and V, and lexical categories correspond to conceptual and semantic realities (Pinker, 1984; Langacker, 1987; among others). Mismatching words, and the way they are analyzed in this work, indicate that although such cognitive distinctions might be necessary conditions for the distinction between lexical categories, they are not sufficient conditions.

    6. Conclusion


