By Roberta Cole,2014-09-02
    Corpus-based Lexical Semantic Study of Verbs of Doubt:

    HUAIYI 懷疑 and CAI 猜 in Mandarin¹

    Mei-chun Liu

    National Chiao-Tung University






1. Introduction

    1.1 Background Assumption

    As lexical semantics has gained increased attention in recent linguistic research, the assumption put forth in Levin (1993:1) serves as a working hypothesis for studies of verbal semantics:

    “… the behavior of a verb, particularly with respect to the expression and interpretation of its arguments, is to a large extent determined by its meaning.”

    By examining verb behavior, researchers aim to extract lexical semantic information that is grammatically pertinent. Lexical studies of verbs try to answer a fundamental question: in what way is verbal syntax related to verbal semantics? In other words, what are the semantic properties coded in the verbal lexicon which potentially influence the syntactic behavior of verbs?

     ¹ I'd like to express my sincere gratitude to my former part-time assistant Ms. Hui-shan Lin for her valuable contribution to this paper. She has helped in data collection, initial counting and preliminary analysis. Without her assistance, the paper would not have been possible.

    The present study reports a case study of verbal semantics with a focus on verbs of doubt in Mandarin. It aims to explore the semantic distinctions coded in the near-synonym pair HUAIYI 懷疑 and CAI 猜. The study takes a corpus-based approach and adopts the module-attribute representational framework proposed in Huang et al. (2000).

    The advantage of using a corpus is that it provides a huge database of natural occurrences from which observational generalizations and statistical comparisons can be made. The association patterns revealed in the database indicate more clearly the range of variation in the grammatical behavior of verbs. What is characteristic of the corpus-based approach is that distributional tendencies, rather than grammaticality, are taken to be the central concern and evidence for linguistic analysis. As for the target of observation, near-synonym sets prove to be most useful in narrowing the scope of investigation, with the benefit of verifiable contrasts in the same semantic field (Grandy 1992). The minimal contrast presumably present between near-synonyms may then serve as a good indicator of meaning components. Tsai et al. (1998) and Chief et al. (2000) convincingly showed the advantages of looking at the corpus distribution of near-synonyms.

1.2 The Corpus

    The corpus used in this study is the Academia Sinica Balanced Corpus (Sinica Corpus), which is the largest database of both written and spoken contemporary Mandarin, containing a total of 5 million words with part-of-speech tagging (Chen et al. 1996). The corpus was developed by the CKIP (Chinese Knowledge and Information Processing) group at Academia Sinica, Taiwan, and it is open to the research community through the internet: ftms-bin/

1.3 The Representational Framework

    The representational framework adopted here is called the Module-Attribute Representation of Verbal Semantics (MARVS, Huang et al. 2000). The model views each verbal sense as one distinct event structure conveying eventive information which can be defined by the composition of modules and attributes. The Event Module represents the overall shape of the event structure. The Role Module represents salient participant roles. Within each module, detailed specifications are represented as attributes: Event-Internal Attributes are features pertaining to the whole event and Role-Internal Attributes are features further specifying a participant role. The model can be schematized as follows:

(1) Module-Attribute Representation of Verbal Semantics (MARVS):

    Verb Sense_i    Eventive Information

        Event Module                 Role Module
        Event-Internal Attributes    Role-Internal Attributes

    The model takes grammar as information-based and lexicon-driven. It makes explicit three related premises. First, verbs with different senses will have different eventive information. The identification of verbal senses is then dependent on the identification of event structures. Second, the eventive information is based on the sum of all attested instances of event realizations. A complex lexical event may never be fully instantiated. Third, the event modules constitute the basic framework for verbal semantics. The classification of information is twofold: structural vs. attributive. There is thus a two-way distinction between modules and attributes. Pre-packaged structural information is viewed as modules while attached attributes provide more detailed description.

    The advantage of the MARVS framework is that by maintaining the module vs. attribute distinction, it allows us to represent finer semantic properties within the same event structure or with the same participant role. Two verbs may share the same type of event structure or the same list of participant roles but differ in event-internal or role-internal attributes.
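    The module vs. attribute distinction can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class names, the function `differing_layers`, and the placeholder verbs and attribute labels (`verb_a`, `attr_x`, etc.) are hypothetical and introduced here for illustration; they are not part of the MARVS proposal in Huang et al. (2000).

```python
from dataclasses import dataclass, field


@dataclass
class EventModule:
    shape: str                                     # overall shape of the event structure
    attributes: set = field(default_factory=set)   # event-internal attributes


@dataclass
class RoleModule:
    # Maps each salient participant role to its role-internal attributes.
    roles: dict = field(default_factory=dict)


@dataclass
class VerbSense:
    lemma: str
    event: EventModule
    participants: RoleModule


def differing_layers(a, b):
    """Return the names of the representational layers on which two senses differ."""
    layers = []
    if a.event.shape != b.event.shape:
        layers.append("event module")
    if a.event.attributes != b.event.attributes:
        layers.append("event-internal attributes")
    if set(a.participants.roles) != set(b.participants.roles):
        layers.append("role module")
    elif a.participants.roles != b.participants.roles:
        layers.append("role-internal attributes")
    return layers


# Hypothetical placeholder entries: same event shape and same role list,
# but different event-internal or role-internal attributes.
verb_a = VerbSense("verb_a", EventModule("process", {"attr_x"}),
                   RoleModule({"agent": set(), "theme": set()}))
verb_b = VerbSense("verb_b", EventModule("process", {"attr_y"}),
                   RoleModule({"agent": set(), "theme": set()}))
verb_c = VerbSense("verb_c", EventModule("process", {"attr_x"}),
                   RoleModule({"agent": {"attr_z"}, "theme": set()}))

print(differing_layers(verb_a, verb_b))
print(differing_layers(verb_a, verb_c))
```

    In this sketch, two senses can agree on every module and still be told apart at the attribute level, which is the kind of finer-grained contrast the framework is meant to capture.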



2. Verbs of Doubt

    In the Sinica Corpus, there are a total of 137 occurrences of CAI 猜 and 369 occurrences of HUAIYI 懷疑. This study will show that the semantic contrast between the two verbs can be captured by aspectual, event-type distinctions and varied strength of epistemic assertion.

2.1 Initial Observation

    Both HUAIYI and CAI are cognition verbs that mark epistemic modality. In the corpus, the two verbs are mainly used as predicates taking a complement-theme. Other grammatical uses of the two verbs are found with very little distributional significance. The table in (2) below shows the overall distribution of grammatical functions for the two verbs:

(2) Distribution of Grammatical Functions of HUAIYI and CAI

    Function       HUAIYI         CAI

    +NP            55 (14.9%)     35 (25.5%)
    +Clause        138 (37.4%)    39 (28.5%)
    +zero          133 (36.0%)    42 (30.7%)
    +DE-comp.      0              13 (9.5%)
    Adjectival     7 (1.9%)       7 (5.1%)
    Nominalized    34 (9.2%)      1 (0.7%)
    Adverbial      2 (0.5%)       0
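    The percentages in (2) are shares of each verb's total corpus occurrences. As a quick consistency check, the short sketch below recomputes them from the raw counts in the table; the variable and function names are illustrative choices made here, and only the counts themselves come from the table.

```python
# Raw counts from table (2): function -> (HUAIYI, CAI)
counts = {
    "+NP":         (55, 35),
    "+Clause":     (138, 39),
    "+zero":       (133, 42),
    "+DE-comp.":   (0, 13),
    "Adjectival":  (7, 7),
    "Nominalized": (34, 1),
    "Adverbial":   (2, 0),
}

# Column totals: total occurrences of each verb across all functions.
totals = tuple(sum(column) for column in zip(*counts.values()))


def share(count, total):
    """Percentage of `count` within `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {
    function: (share(huaiyi, totals[0]), share(cai, totals[1]))
    for function, (huaiyi, cai) in counts.items()
}

print("totals (HUAIYI, CAI):", totals)
for function, (huaiyi_pct, cai_pct) in shares.items():
    print(f"{function:<12} {huaiyi_pct:>5}%  {cai_pct:>5}%")
```

    The column totals come out as 369 for HUAIYI and 137 for CAI, matching the occurrence counts reported at the start of this section.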

    The two verbs share similar verbal patterns, taking either a nominal or a clausal complement. At first glance, they share certain meaning components and can both occur in the following contexts with either an affirmative (3) or a negative (4) complement clause:

(3) 我懷疑/猜他是兇手

    wo HUAIYI / CAI ta shi xiongshou

    I HUAIYI / CAI he be murderer

    ‘I suspect that he is the murderer.’

(4) 我懷疑/猜果汁不是純的

    wo HUAIYI / CAI guozhi bu shi chun de

    I HUAIYI / CAI juice not be pure DE

    ‘I suspect that the juice is not pure.’

    Despite the similarity in the above examples, the two verbs behave differently in other structures. For example, HUAIYI, but not CAI, can be modified by a preverbal degree adverbial hen ‘very’, as in (5), whereas CAI, but not HUAIYI, can take a postverbal resultative, as in (6):

(5) 我很懷疑/*猜

    wo hen HUAIYI / *CAI

    I very HUAIYI / *CAI

    ‘I'm very doubtful.’

(6) 我猜/*懷疑得出

    wo CAI / *HUAIYI de chu

    I CAI / *HUAIYI DE out

    ‘I can guess it.’

    Apparently, the verb HUAIYI is more like a stative predicate, allowing a degree modifier, while CAI is more like an active predicate, allowing a potential resultative complement.

    The two examples above reflect some preliminary differences in their event structures. In the following, their collocational patterns will be closely examined with generalizations on their semantic distinctions.

2.2 Collocation and Aspectual Distinction



    As observed in example (5), the verb HUAIYI can co-occur with the degree modifiers hen/shifen/feichang 很/十分/非常 ‘very’ or the evaluative marker zhide 值得 ‘worth’, which usually combine only with stative verbs. The distributional frequency of the two verbs with evaluative elements is shown below:

(7) Collocation with evaluative markers

                Degree modifier                          Evaluative marker
                hen/shifen/feichang 很/十分/非常 ‘very’     zhide 值得 ‘worth’

    HUAIYI      24 (6.5%)                                6 (1.6%)
    CAI         0                                        0

    This shows that the two verbs differ fundamentally in event type: HUAIYI is more stative, subject to degree modification and evaluation.

    In addition, they display other distinctions in aspectual properties, as evidenced by their collocation with various aspect-marking elements. HUAIYI can be preceded by the inceptive verb kaishi ‘start’, denoting an inchoative change, but CAI cannot:

(8) 她開始懷疑/*猜果汁到底是不是純的

    ta kaishi HUAIYI / *CAI guozhi daodi shi bu shi chun de

    she start HUAIYI / *CAI juice to bottom be not be pure DE

    ‘She started wondering if the juice was pure.’

    Below is their distributional frequency with two inchoative-marking devices, preverbal kaishi ‘start’ and postverbal qilai ‘up’. It is clear that HUAIYI allows an event focus on the starting point.² An event focus refers to the profiled event component of a complex event (for details, see Liu 1999).

     ² It seems to be intuitively true that CAI may also occur with the verb kaishi, indicating an inceptive-progressive aspect, as in the imperative kaishi cai! ‘You may start guessing.’ This use is, however, not found in the corpus. Given that CAI is essentially an activity verb, it is potentially compatible with inceptive

(9) Collocation with inchoative marking

                Inceptive verb     Inchoative
                kaishi             qilai

    HUAIYI      14 (3.5%)          1 (0.2%)
    CAI         0                  0

    However, with regard to the marking of an endpoint, CAI can be followed by the adverbial WAN ‘finish’, denoting the completion of an event, but HUAIYI cannot:

(10) 你到底猜/*懷疑完了沒?

    ni daodi CAI / *HUAIYI wan le mei

    you to bottom CAI / *HUAIYI finish LE no

    ‘Have you finished guessing or not?’

    Moreover, while CAI can co-occur with a durational phrase of time, HUAIYI cannot:

(11) 他已經猜/*懷疑了三天三夜了

    ta yijing CAI / *HUAIYI le san tian san ye le

    he already CAI / *HUAIYI LE three day three night LE

    ‘He has already been guessing for three days and three nights.’

    The above examples (10)-(11) show that the event structure of CAI involves a process, which may be bounded by an endpoint. When the endpoint is profiled, the verb predicates the stative result of guessing, the acquisition of a certain weak belief, as exemplified in (3)-(4) above. The meaning extension from a process to a stative predication seems to be characteristic of verbs of cognition. For example, the process of thinking may result in obtaining a particular thought, expressed as a clausal complement to the verb think.



    The distributional differences between HUAIYI and CAI as discussed above indicate clearly their distinction regarding aspectual composition, which can be represented with different event modules in MARVS. The verb HUAIYI may co-occur with a degree or evaluative adjunct, indicating its stativity; and it allows a predicative focus on the starting point of the state, indicating a change of state or inchoative state. On the other hand, CAI may be used with a perfective marker or a durative phrase, typical of an activity situation (cf. Smith 1991). The event CAI refers to is a potentially on-going process that may have a final point, a resultative state of the cognitive activity:

(12) Event Modules of HUAIYI and CAI:

    HUAIYI: inchoative state ____

    CAI: bounded process //////

    The above distinction in the Event Module also bears implications for the Role Module: the subject of HUAIYI is more like an experiencer, while CAI takes a volitional agent.
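    The aspectual diagnostics discussed in this section can be summarized as a small decision procedure. The sketch below is illustrative only: the feature names and the rule ordering in `event_module` are assumptions introduced here, not part of MARVS, and the True/False values simply transcribe the acceptability judgments in (5) and (8)-(11).

```python
# Acceptability of each verb with four diagnostics, as judged in (5), (8), (10), (11).
DIAGNOSTICS = {
    "HUAIYI": {
        "degree_modifier": True,     # hen HUAIYI, example (5)
        "inceptive_kaishi": True,    # kaishi HUAIYI, example (8)
        "completive_wan": False,     # *HUAIYI wan, example (10)
        "durational_phrase": False,  # *HUAIYI le san tian san ye, example (11)
    },
    "CAI": {
        "degree_modifier": False,
        "inceptive_kaishi": False,   # unattested in the corpus; see the footnote on kaishi cai
        "completive_wan": True,
        "durational_phrase": True,
    },
}


def event_module(features):
    """Map diagnostic results to an event-module label (hypothetical rule set)."""
    if features["degree_modifier"] and not features["durational_phrase"]:
        # Stative; an inceptive verb profiles the starting point of the state.
        return "inchoative state" if features["inceptive_kaishi"] else "state"
    if features["durational_phrase"]:
        # Process; a completive marker indicates a possible endpoint.
        return "bounded process" if features["completive_wan"] else "process"
    return "undetermined"


for verb, features in DIAGNOSTICS.items():
    print(verb, "->", event_module(features))
```

    Under these assumed rules, the two verbs receive the same labels as in (12): HUAIYI is classified as an inchoative state and CAI as a bounded process.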

2.3 Presupposition and Epistemic Distinction

    Another important and interesting distinction between the two verbs has to do with their epistemic marking capacity and contextual presupposition. Both verbs indicate low epistemic assertion of a proposition, distinct from strong epistemic assertion verbs such as renwei ‘think’ or xiangxin ‘believe’. The extremely weak assertional strength in the use of HUAIYI may even lead to the opposite: a denial of a proposition that is contextually presupposed. The verb HUAIYI expresses such a strong doubt that it may implicate a denial of the truth-value of the proposition. Take (3) as an example, repeated in (13) below: wo HUAIYI ta shi xiongshou 我懷疑他是兇手 ‘I doubt that he is the murderer.’ The sentence is potentially ambiguous.³ Besides the possible reading as making a weak assertion, similar to the use of CAI, in saying ‘I doubt that he is the murderer’, one can actually intend to deny a commonly held presupposition that he IS the murderer, given an appropriate context.

     ³ Out of context, the sentence (Example 13) may have two different interpretations. It may be used to claim a weakly asserted belief, ‘I think he is the murderer’, or to challenge such a presupposition. The point here is that the possibility of using HUAIYI to challenge a presupposition reveals its peculiar status in epistemic marking.

    As shown in (13), it is possible to use HUAIYI to challenge a statement of presupposition:

(13) Challenge to a presupposition

     Statement of a Presupposition

    我懷疑 [他是兇手]

    wo HUAIYI [ta shi xiongshou]

    I HUAIYI he be murderer

    ‘I doubt that he is the murderer.’ (Contrary to the entertained belief, I don't think he is the murderer.)

    The above interpretation is made possible because the verb HUAIYI has such low epistemic strength that it may function as a negator to challenge a pre-existing assumption. The key to this pseudo-negation use is the existence of a contextual presupposition. Without any contextual presupposition, HUAIYI serves to make a weak assertion with very low epistemic certainty; it functions to make an irrealis assertion, as defined in Givón (1993: 170):

    The proposition is weakly asserted as either possible or likely; but the speaker is not ready to back it up with evidence or other strong grounds; and a challenge from the hearer is readily entertained or even explicitly solicited.

    However, when occurring with a clearly inferred contextual presupposition, HUAIYI serves to mark it with such a low degree of certainty that it is nearly negated. This particular use of HUAIYI resembles a negative assertion in its presuppositional requirement. Givón (1993: 189) has made it clear that a negative assertion is made on the tacit assumption that the hearer either has heard about, believes in, is likely to take for granted, or is at least familiar with the corresponding affirmative proposition. This observation is crucial to the understanding of the epistemic distinction between the two verbs.
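    The interaction between verb choice and contextual presupposition described in this section can be stated as a simple lookup. This is a rough sketch under stated assumptions: the function name `available_readings` and the reading labels are introduced here for illustration only, and the rule merely restates the claim that the challenge reading of HUAIYI requires a contextual presupposition.

```python
def available_readings(verb, presupposed):
    """Readings of 'X VERB [proposition]' given whether the proposition
    is contextually presupposed (hypothetical labels)."""
    if verb == "CAI":
        # CAI only makes a weak assertion, regardless of context.
        return ["weak assertion"]
    if verb == "HUAIYI":
        if presupposed:
            # With a presupposition in context, HUAIYI may also challenge it.
            return ["weak assertion", "challenge to presupposition"]
        return ["weak assertion"]
    raise ValueError(f"no entry for verb: {verb}")


for verb in ("HUAIYI", "CAI"):
    for presupposed in (False, True):
        print(verb, presupposed, available_readings(verb, presupposed))
```

    On this sketch the two verbs are indistinguishable without a presupposition, and diverge only when one is present, which is where the ambiguity of (13) arises.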



3. MARVS Representation of HUAIYI and CA;I

    Besides their module distinction in event types, the two verbs also differ in event-internal characteristics: while they both encode very low epistemic strength, HUAIYI may mark an assertion as so weak that it almost negates or counters the proposition. Below is a schematic representation of the semantic distinctions between the two verbs using the Module-Attribute model:

(14) MARVS for HUAIYI 懷疑 and CAI 猜:

    Verb           Event Module                 Role Module
                   Event-Internal Attributes    Role-Internal Attributes

    HUAIYI 懷疑    inchoative state ______
                   [irrealis assertion] [presuppositional]

    CAI 猜         bounded process //////
                   [low epistemic strength]

4. Conclusion

    Verbs of doubt are semantically a subset of verbs of cognition. The verbs in this class normally involve two kinds of participant roles: the Initiator of Cognition (Experiencer or Agent) and the Content of Cognition (Theme). Different verbs encode a different event structure that selects a different view of aspectual composition and different details of the participant roles. Moreover, as can be seen from the above discussion, with this particular subset, the epistemic assertional strength encoded in a particular verb may be the source for potential pragmatic implication. The lexical semantic distinctions made in this study can also be further studied as to their interaction with pragmatic implicature.

    The importance of lexical semantics has been highly recognized in recent years as


