
Knowledge of English Collocations: An Analysis of

    Taiwanese EFL Learners

LI-SZU HUANG, University of Texas at Austin

    This research investigated Taiwanese EFL students’ knowledge of English

    collocations and the collocational errors they made. The subjects were 60 students

    from a college in Taiwan. The research instrument was a self-designed Simple

    Completion Test that measured the subjects’ knowledge of four types of lexical

    collocations: free combinations, restricted collocations, figurative idioms, and

    pure idioms. The results indicated that, for the subjects, free combinations created

    the least amount of difficulty, whereas pure idioms were the most challenging.

    Additionally, they performed about equally well on restricted collocations and

    figurative idioms. In general, the subjects’ deviant answers demonstrated their

    insufficient knowledge of English collocations. It is concluded that EFL learners’

    errors in collocations can be attributed to negative L1 transfer.


    Research in the field of TESL/TEFL (teaching English as a second/foreign

    language) has recognized collocational knowledge as a crucial part of

    phraseological competence in English (Fontenelle, 1994; Herbst, 1996; Lennon,

    1996; Moon, 1992). The syntagmatic relations of a lexical item help define its

    semantic range and the context where it appears. Awareness of the restrictions of

    lexical co-occurrence can facilitate ESL/EFL learners’ ability to encode language

    (Nattinger, 1989; Seal, 1991). It also enables them to produce sentences that are

    grammatically and semantically acceptable. They thus can conform to the

    expectations of academic writing or speech communication (Bahns, 1993; Bahns

    & Eldaw, 1993; Farghal & Obiedat, 1995; Granger, 1998).

    Research on ESL/EFL learners’ vocabulary development has mainly focused on the knowledge and production of individual lexical items. In contrast,

    researchers have devoted scant attention to knowledge of collocations. As Bahns

    and Eldaw indicated in an empirical study (1993), EFL students did not acquire

    collocational knowledge while acquiring vocabulary. Instead, their collocational

    proficiency tended to lag far behind their vocabulary competence. Among the

    small number of studies on learners’ performance in English collocations, the majority have observed the difficulty of learners whose native languages are

    similar to English. Investigations of the collocational knowledge of learners who

    have a very different linguistic system (for example, Chinese or Japanese)

    remain scarce. Research on the difficulty that learners from different L1

    backgrounds encounter in acquiring English collocations would prove valuable

    and would enable teachers to identify effective ways of promoting

    phraseological competence in their learners.


    To obtain a holistic picture of the issues related to the acquisition of English collocations by ESL/EFL learners, this section reviews the literature on

    the topics of (a) the categorization of collocations, (b) factors influencing

    ESL/EFL learners’ performance in collocations, and (c) learners’ strategies in

    dealing with collocations.

Categorization of Collocations

    Some sequences of lexemes can co-occur due to an individual speaker’s

    choice of words, but others appear in a predictable way. When the co-occurrence

    of lexical items has a certain degree of mutual predictability, the sequence of

    these items is considered a collocation (Cruse, 1991; Jackson, 1989). As Crystal

    (1995) has pointed out, the collocation of particular lexemes is not necessarily

    based on the subject’s knowledge of the world. Rather, what is required for one

    item to attract another is, to some extent, dependent on the intuitive

    understanding of a native speaker. The predictability of certain word

    combinations can be weak; for instance, dark is an item with a diverse range of

    collocates. In contrast, an item such as rancid tends to have strong predictability

    because it can collocate with only two or three items. Researchers generally agree

    that different types of collocations should be placed on a continuum (Fontenelle,

    1994; Herbst, 1996; Howarth, 1998a; Nattinger & DeCarrico, 1992; Palmer, 1991).

    They indicate that, simply by relying on the meanings of collocational

    constituent elements, it is hard to draw a clear distinction between collocations

    that are either predictable or not.

    As far as the dividing points on the continuum are concerned, researchers have yet to reach an agreement. Nonetheless, the criteria for

    categorizing different types of word combinations basically include semantic

    transparency, degree of substitutability, and degree of productivity (Carter, 1987;

    Howarth, 1998b; Nattinger & DeCarrico, 1992). On the one end of the

    collocational continuum are free combinations with the highest degree of

    productivity, semantic transparency, and substitutability of items for their

    constituent elements. On the other end are idioms that are the least productive,

    the most opaque in semantics, and the most frozen in terms of substitutability of

    elements. Between these two extremes are different types of restricted collocations.


    At present, we still lack a clear, non-controversial and all-embracing definition of collocation (Fontenelle, 1994). Consequently, researchers tend to use

    different terms and scopes to describe the syntagmatic relationships between

    lexical items (Granger, 1998; Moon, 1992). The current study adopts Howarth’s

    (1998b) categorization model of lexical collocations because the model provides a

    thorough explanation of the classification criteria and easy-to-follow examples.

    In the model, the collocational continuum contains four categories of collocations:

    (a) free combinations, (b) restricted collocations, (c) figurative idioms, and (d)

pure idioms. A free combination derives its meaning from composing the literal

    meaning of individual elements, and its constituents are freely substitutable. A

    typical example provided by Howarth is blow a trumpet. A restricted collocation is more limited in the selection of compositional elements and usually has one

    component that is used in a specialized context, e.g., blow a fuse. For idioms that are semantically opaque or highly frozen, Howarth further divides them into

    figurative and pure idioms. While a figurative idiom has a metaphorical meaning

    as a whole that can somehow be derived from its literal interpretation, a pure

    idiom has a unitary meaning that is totally unpredictable from the meaning of its

    components. The examples Howarth gives for the two types are blow your own trumpet and blow the gaff, respectively.

Factors Influencing Performance in Collocation

    Recent empirical studies have identified several factors that may influence

    learners’ performance in producing collocations. These factors include semantic

    fields, meaning boundaries, and collocational restrictions. The semantic field of a

    lexicon is determined by its conceptual field. Examples of conceptual fields

    include color, kinship, and marital relations (Allan, 2001). Biskup (1992)

    examined Polish and German EFL learners’ performance in English collocations.

    He concluded that the wider the semantic field of a given lexical item, the more

    L1 interference errors it might trigger. For example, a number of subjects

    provided *lead a bookshop for the target collocation run a bookshop, which was clearly an instance of L1 interference. In the same vein, the more synonyms an

    item had, the more difficulties learners encountered in producing a restricted

    collocation. Lennon (1996) also pointed out the reasons accounting for learners’

    erroneous use of high frequency verbs such as put, go, and take. The main reason lay in these verbs’ rich polysemy and syntactic complexity. As they formed phrases with prepositions, these verbs created collocational restrictions that

    required special attention to their collocational environments. These lexical

    properties surely created different degrees of difficulty for learners.

    The second factor concerns the influence of learners’ native language. Because of the commonality of some human situations, different languages have

    parallel fixed expressions that are syntactically and semantically similar (Moon,

    1992; Teliya, Bragina, Oparina, & Sandomirskaya, 1998). Due to cultural

    specificity, however, certain elements embedded in these expressions differ

    across languages. For example, English and Russian have a restricted collocation

    to express the process of forming a person’s character. The English collocation is

    to mold someone’s character, whereas the Russian expression vuikovuivat’

    kharakter means, literally, to forge someone’s character. This Russian collocation

    is associated with a blacksmith hammering at a metal object to give it firmness

    and hardness. Though the English expression is also connected with a firm object,

    it emphasizes the idea of giving shape to an originally shapeless mass (Teliya et

    al., 1998). These similar but distinct expressions may cause a negative transfer

from learners’ L1 (Granger, 1998). L1 influence is most prevalent when learners

    perform translation tasks. Lacking collocational knowledge, learners rely heavily

    on the L1 as the only resource and thus do better in those collocations that have

    L1 equivalents than those that do not (Bahns, 1993; Bahns & Eldaw, 1993; Farghal

    & Obiedat, 1995).

    The third factor has to do with individual learners’ collocational

    competence. Granger (1998) and Howarth (1998a), by comparing the writing

    corpora of ESL/EFL learners and native English speakers, both reported that

    these learners generally demonstrated deficient knowledge of English

    collocations. Compared with their native-speaker counterparts, the ESL/EFL

    learners produced a lower percentage of conventional collocations but a higher

    percentage of deviant combinations. These learners tended to have a weak sense

    of the salience of collocational patterns. Other researchers such as Bahns and

    Eldaw (1993) and Farghal and Obiedat (1995) reported likewise. They found that

    L2 learners showed a large gap between their receptive and productive knowledge of collocations.


    Teliya et al. (1998) identified culture-related knowledge as another dimension embodied in the issue of lexical competence. They argued that the use

    of some lexical collocations was restricted by certain cultural stereotypes.

    Metaphorical collocates, for instance, served as clues to the cultural data

    associated with the meaning of restricted collocations. Lack of cultural

    competence might be responsible for learners’ failure to acquire such culturally-

    marked collocations. This was especially true in the case of idioms because their

    metaphorical meanings were highly connected with cultural connotations and

    discourse stereotypes.

    Idioms represent a unique form of collocation, and several factors affect their comprehension and production. These include the context in which the

    idioms are situated, the meanings of the constituents of an idiom, and learners’

    conceptual knowledge of metaphors and figurative competence (Gibbs, 1995;

    Hamblin & Gibbs, 1999; Levorato, 1993). Idioms are perceived to be more

    appropriate by native speakers when the context of the idiom is aligned with the

    intended meaning. Gibbs (1995) argued that for every analyzable idiom its

    salient part (for example, the main verb) could determine the meaning of the

    entire idiomatic expression. Based on the outcomes of a series of studies,

    Hamblin and Gibbs (1999) concluded that learners’ figurative competence would

    also influence their comprehension of idioms.

Strategies in Dealing with Collocations

    Due to insufficient knowledge of collocations, English learners may adopt certain strategies to produce collocations and thus create certain types of errors.

    The strategy used most commonly is transfer in which learners rely on L1

    equivalents when they fail to find the desired lexical items in the L2. The Polish

    subjects in the study by Biskup (1992) mentioned above, for instance, were aware

of the significant difference between their L1 and English in terms of linguistic

    structure. Hence, their error types reflected an extension of L2 meaning on the

    basis of L1 equivalents. On the other hand, the group of German learners was

    inclined to assume formal similarities between their L1 and English. As a result,

    they made errors such as language switches and blends. The transfer strategy

    may also reflect the learners’ assumption that there is a one-to-one

    correspondence between their L1 and L2. As Farghal and Obiedat (1995) pointed

    out, positive transfer occurred when the target collocations matched those in the

    L1, while negative transfer appeared when no corresponding patterns could be

    found in the L1.

    The second strategy is avoidance (Bahns & Eldaw, 1993; Farghal & Obiedat, 1995; Howarth, 1998). Second language learners may avoid the target

    lexical items because they fail to retrieve the appropriate items of which they

    have passive knowledge. As a consequence, they alter the intended meaning of

    the collocations (Bahns & Eldaw, 1993; Farghal & Obiedat, 1995; Howarth, 1998b).

    The third strategy often used by learners is paraphrasing, or using synonyms. Learners may substitute the target item with a synonymous

    alternative and use paraphrasing to express the target collocations with which

    they are not familiar. For example, the German learners in Biskup’s study (1992)

    adopted more creative strategies than the Polish learners. They thus provided

    more descriptive answers such as substituting crack a nut with break a nut open.

    Also noteworthy is the study by Farghal and Obiedat (1995), who investigated the use of synonyms by Arabic EFL learners. The study revealed

    that the subjects’ heavy reliance on the open choice principle for item selection

    led to deviant and incorrect collocations. Additionally, the researchers found that

    the more collocations learners acquired, the fewer paraphrases they used in their

    L2 production. In this case, paraphrasing was generally used as an escape-hatch

    that helped communication proceed.

    There are of course other strategies frequently adopted by learners. For example, learners may experiment by creating a collocation that they think is

    substitutable for the target one (Bahns & Eldaw, 1993; Granger, 1998). Granger

    (1998) noticed in her corpus of French essays that learners created collocations

    they considered to be acceptable such as ferociously menacing and shamelessly

    exploited. Apparently, these unconventional word combinations were a result of

    learners’ creative invention.

    Howarth (1998b) examined the errors in the corpus of non-native writers and identified some other strategies including analogies and repetition. These

    writers created collocations based on a familiar L2 collocation. For instance, they

    drew an analogy between adopt a method and adopt an approach. However, this

    strategy might also lead to the overgeneralization of collocability. An example of

    this would be adopt ways, an idiomatic expression which would likely have

    marginal usage among non-native speakers. The non-native writers in Granger’s

    (1998) study tended to use a limited number of collocations repeatedly such as

the combination of very with a variety of adjectives. The strategy of repetition

    was particularly favored when learners did not possess sufficient knowledge of collocations.



    The preceding review of learners’ strategies provides insights concerning

    how they deal with English collocations. It also provides an understanding of the

    processes they go through to attain L2 collocations. Some questions naturally

    arise: To what extent can these strategies be generalized for learners from

    different L1 backgrounds? What kinds of difficulties do learners from different

    linguistic backgrounds encounter in dealing with English collocations? The

    purpose of this research, therefore, was to specifically investigate Chinese EFL

    learners’ knowledge of different types of English collocations. These include free combinations, restricted collocations, figurative idioms, and pure idioms, as

    proposed by Howarth (1998b).

    It was hypothesized that the degrees of difficulty for learners were subject to an item’s position in the collocational continuum, starting with free

    combination as the easiest type and pure idiom the most difficult. In addition,

    the research investigated critically the errors the learners produced in the target

    task. An analysis of their responses would reveal their difficulty in acquiring

    English collocations and uncover the strategies they used to deal with problems.

    It was expected that an understanding of learners’ strategies would shed light on

    approaches for teaching collocations.



    Sixty students from a college in southern Taiwan were recruited as the subjects of the study. Of these sixty students, 19 were male and 41 were female,

    and they ranged from 19 to 22 years of age. Majoring in medical science and

    technology, these students took English as a mandatory course for the

    completion of their degree. They had received at least six years of English

    instruction by the time they graduated from high school.



    The research instrument was a self-designed Simple Completion Test (SCT) that measured the subjects’ knowledge in four types of lexical collocations:

    free combinations, restricted collocations, figurative idioms, and pure idioms

    (Howarth, 1998b). The test consisted of 40 items in the form of free-response with

    ten items in each collocational category. Each item contained two or three

    sentences that provided a context in which a specific collocation or idiom about

    food or animals was embedded. By referring to the sentential context, a subject

was required to fill in an appropriate word to complete the target collocation or

    idiom. Most sentences in the SCT were adapted from Booker’s Longman

    Active American Idioms (1994).

    The 40 test items were distributed to four sections according to their roles

    as a part of speech. Each section comprised separate test items falling into the

    four types of collocations previously mentioned. Section A required subjects to

    fill in an appropriate verb, Section B an adjective, and Section C a noun about

    food. Target items in Section D were nouns related to animals. Please refer to the

    Appendix for the complete list of test items. Examples for each type of lexical

    collocations are given below. (The number in front of each example is its item

    number in the SCT.)

    Free combination - 11. Those boys and girls don’t ___ orange juice. They

    prefer something special, like pineapple juice or punch. (Fill in a verb.)

    Restricted collocation - 25. They also provide ___ drinks at the party for

    those who don’t drink alcohol. (Fill in an adjective.)

    Figurative idiom - 34. A lazy person always gives the excuse that working is not his cup of ___. (Fill in a noun about food.)

    Pure idiom - 47. The Browns bought a very cheap house, but later they spent a lot of money repairing it. We all think that they bought a ___ in a poke.

    (Fill in a noun about animal.)


    The SCT was administered in the classroom where regular instruction

    for the subjects took place. Each subject was allowed sufficient time to work

    individually on the test questions. It took about 35 minutes for all the subjects to

    finish the test. Before the test started, the researcher provided directions in

    Chinese and encouraged the subjects to answer each question or take educated

    guesses if they were unsure of the answer.

Data Collection and Analysis

    The subjects’ answer sheets were collected and analyzed using both quantitative and qualitative paradigms. The correct answers provided by each

    subject were first marked. Special consideration in scoring was given to test

    words under the categories of free combinations and restricted collocations. An

    answer that showed a correct choice of lexicon but had wrong inflections was

    judged to be correct. Note the example below.

    It is possible that after several decades, children may not know how a

    pig ___. This may happen because they have never seen a pig.

In this case, answers such as walks, walk, walking were all counted as correct

    because the focus of the SCT was on the correct choice of collocates. The response

word walk can collocate perfectly with pig in this sentence, and thus the

    inflectional errors in verbs or numbers of nouns were ignored.

    The criteria applied to items under the categories of figurative and pure

    idioms were slightly different. Look at the following example:

    We ___ a whale of a time at Paul’s birthday party yesterday. It was really


The answers had, have, has were all counted as correct. The choice of the verb to

    have was correct for this idiom and the error in verbal inflection did not affect the

    meaning of the idiom. Accordingly, the above responses were all considered

    correct. This principle does not apply to the following example.

    Ten years ago, the streets in Chicago were dirty and public services were

    awful. The city had really gone to the ___. But now it’s much better.

    In this situation, the word dogs was the only correct answer while the alternative word dog failed to fit this pure idiom, a type of collocation that is completely frozen.

    Subjects were given no freedom to change the plural to the singular in this idiom.
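
    The scoring rules described above can be summarized in a short sketch. The Python below is an illustration only, not the procedure the researcher actually used: the Item class, the score function, and the three answer keys are hypothetical and are built from the three examples given in the text. It assumes that each item stores every surface form that counts as correct, so an item that tolerates inflectional errors lists several forms, whereas a frozen pure idiom lists only one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One SCT item; `accepted` holds every surface form scored as correct."""
    label: str
    accepted: frozenset


# Hypothetical answer keys based on the three examples quoted in the text.
PIG = Item("how a pig ___", frozenset({"walk", "walks", "walking"}))
WHALE = Item("___ a whale of a time", frozenset({"have", "has", "had"}))
DOGS = Item("gone to the ___", frozenset({"dogs"}))  # frozen: plural only


def score(item: Item, response: str) -> str:
    """Classify a response as 'correct', 'blank', or 'deviant'."""
    answer = response.strip().lower()
    if not answer:
        return "blank"
    return "correct" if answer in item.accepted else "deviant"


print(score(PIG, "walking"))  # inflection differs, collocate is right
print(score(DOGS, "dog"))     # singular form breaks the frozen idiom
```

    Listing the accepted forms explicitly, rather than lemmatizing responses, keeps the leniency decision visible for each item, which mirrors the item-by-item judgments reported here.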


    In the quantitative analysis, the number of correct responses for each

    test word was counted, as were the numbers of blank responses and deviant

    answers. Descriptive statistics were then generated to compare subjects’

    performance in each category and observe the relative difficulty of different

    categories. The mean under each category represented the average number of

    subjects who answered the test items in the category correctly. The average

    number of blank responses in each category was also counted because it

    indicated the difficulty level perceived by the subjects. Since students were

    encouraged to answer each test item without leaving any blanks, the blank

    responses may suggest that they were unable to provide even an educated guess

    due to the difficulty of the item. Another indicator of item difficulty is the

    number of variations in subjects’ incorrect answers. It was suspected that subjects would provide more variations for the items they perceived as more difficult.
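
    As a rough illustration of these three indicators, the sketch below computes a category mean and standard deviation from per-item counts of correct responses, and counts the distinct incorrect answers given to a single item. The function names and all numbers are invented for the example and are not data from the study. The use of the sample standard deviation is also an assumption, since the text does not say which formula was applied.

```python
from statistics import mean, stdev


def describe(correct_counts):
    """Mean and sample SD of the per-item counts of correct responses."""
    return mean(correct_counts), stdev(correct_counts)


def count_variations(responses, accepted):
    """Number of distinct non-blank, incorrect answers given to one item."""
    cleaned = (r.strip().lower() for r in responses)
    return len({r for r in cleaned if r and r not in accepted})


# Invented numbers for illustration; these are NOT data from the study.
example_counts = [52, 48, 55, 37, 50, 49, 37, 53, 51, 56]
m, sd = describe(example_counts)
print(f"mean = {m:.2f}, SD = {sd:.2f}")

example_responses = ["walks", "is", "like", "Is", "", "looks"]
n_var = count_variations(example_responses, {"walk", "walks", "walking"})
print(f"variations of incorrect answers = {n_var}")
```

    Blank responses are excluded from the variation count because they are tallied separately as their own indicator of difficulty.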


    In addition, a qualitative paradigm was used to analyze the collocational

    clusters subjects provided for each category. This application aimed to reveal

    which words caused confusion in terms of their collocability and which lexical

    collocations were especially challenging to the respondents.


    Table 1 displays the mean number of correct responses per item for each

    category. The mean of the free combination category is dramatically higher than

    that of the other three. The category of pure idioms, as predicted, has the lowest

mean. The mean of figurative idioms is slightly higher than that of restricted

    collocations, but subjects’ performance in the former type varied more widely around

    the mean. The results have partly confirmed the hypothesis that free

    combinations appear to be the easiest to deal with, whereas pure idioms are the

    most challenging. Figurative idioms were expected to be more difficult than

    restricted collocations. Surprisingly, however, they created about the same degree of

    difficulty for the subjects.

Table 1 Descriptive statistics of the subjects’ performance in four categories (N = 60)

            Free            Restricted      Figurative    Pure
            combinations    collocations    idioms        idioms

    Mean    49.20           8.10            8.60          4.0

    SD      7.51            7.67            11.08         10.23

     The same tendency emerged when the researcher examined the average

    numbers of subjects’ deviant answers (exclusive of the correct answers provided)

    and blank responses. As shown in Table 2, the subjects gave considerably fewer

    deviant answers and blank responses for free combinations than in the other

    three categories. The figures in the categories of restricted collocations and

    figurative idioms do not show a great difference, indicating that subjects faced an

    equal level of difficulty for these two categories. Among the four types, pure

    idioms triggered the most deviant answers and blank responses. Although

    subjects were encouraged to guess rather than skip items, on average

    approximately one fifth of them failed to provide even a guess for a given

    pure-idiom item.

    For restricted collocations and both figurative and pure idioms, the subjects created a large number of variations of incorrect answers. The wide

    variety of deviant answers implies a lack of collocational knowledge.


Table 2 Average numbers of blank responses and variations of incorrect answers in four

    categories (N = 60)

                             Free            Restricted      Figurative    Pure
                             combinations    collocations    idioms        idioms

    Blank responses          1.7             7.6             9.7           12.8

    Numbers of variations
    of incorrect answers     7.6             23.3            23.2          26.6
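
    Because the figures in Tables 1 and 2 are average counts out of 60 subjects, dividing each by N expresses them as proportions of the group. The short sketch below does this for the mean number of correct responses and the mean number of blank responses. The category means are copied from the two tables, while the variable and function names are hypothetical.

```python
N_SUBJECTS = 60

# Category means copied from Table 1 (correct responses).
correct_mean = {
    "free combinations": 49.20,
    "restricted collocations": 8.10,
    "figurative idioms": 8.60,
    "pure idioms": 4.0,
}

# Category means copied from Table 2 (blank responses).
blank_mean = {
    "free combinations": 1.7,
    "restricted collocations": 7.6,
    "figurative idioms": 9.7,
    "pure idioms": 12.8,
}


def share(mean_count, n=N_SUBJECTS):
    """Express an average count of subjects as a proportion of the group."""
    return mean_count / n


for category in correct_mean:
    print(
        f"{category:<24}"
        f" correct {share(correct_mean[category]):6.1%}"
        f"  blank {share(blank_mean[category]):6.1%}"
    )
```

    On this reading, 12.8 blank responses out of 60 is about 21 percent, which corresponds to the one fifth of subjects mentioned for pure idioms.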

    An analysis of subjects’ collocational errors in each category suggests that

    test items created different degrees of difficulty for the subjects. For all test words

in free combinations, more than two thirds of the subjects answered correctly

    except for items 14 (how a pig ___) and 22 (___ food). Only 37 out of 60 responded correctly for these two items. For item 14, some subjects provided deviant

    answers that did not comply with the syntactic structure of the indirect question

    starting with how, e.g., is, like. Item 22 required the subjects to fill in an

    appropriate adjective that collocates with food. Many of the deviant answers, however, contained lexical items of other parts of speech and spelling errors. As

    for the category of restricted collocations, no subjects correctly answered items 19

    (milk their cows) or 27 (soup too thick/solid/stiff to stir). Items 18

    (hen hatch/produce eggs), 33 (food stamps), 17 (make/propose/drink a toast), and 25

    (soft/non-alcoholic drinks) were also very difficult, as fewer than ten subjects

    responded appropriately.

    The subjects had an equally unsatisfactory performance in figurative

    idioms. None of them could give a correct answer for items 110 (smell a rat), 210

    (a dark horse), 211 (beat a dead horse) and 45 (a bull in a china shop). By contrast,

    more than half of the subjects correctly answered item 43 (a paper tiger). Similarly,

    their performance in item 34 (his cup of tea) was also remarkable, with 22 out of 60 subjects providing the correct answer. Pure idioms, as expected, proved to be

    extremely demanding for the subjects, as none of them managed to provide a

    correct answer for half of the test items. The other half of the test items with the

    exception of item 111 (had a whale of a time) was also difficult, as only one or two subjects came up with the correct answers. Thirty-three subjects out of 60

    provided a correct choice for item 111, though they made a great number of

    inflectional errors. The reason may be that these subjects made an analogy of this

    idiom with have fun or have a good time. Otherwise, they would not be able to answer it correctly because pure idioms are frozen in terms of lexical collocability

    and meaning fixation. On the other hand, their deviant answers may, to a great

    extent, also have resulted from guessing. Taking this into account, the researcher

    did not further analyze their collocational errors in pure idioms.

    In comparison with pure idioms, the subjects’ deviant answers for restricted collocations and figurative idioms may shed light on their knowledge

    of collocations since these two categories allow a certain degree of flexibility in

    lexical combinations. For this reason, a qualitative approach was utilized to

    analyze the collocational errors the subjects created in these two categories. Table

    3 shows the deviant answers for each test item. Only test items involving more

    than 5 respondents are displayed.

Table 3

    Correct and deviant answers for restricted collocations and figurative idioms (N = 60)
