A corpus-based study of the phraseological behaviour of abstract
nouns in medical English
Natalia Judith Laso Martín
University of Barcelona
It has long been acknowledged (Carter 1998; Williams 1998; Biber 2006; Hyland 2008) that writing a text entails not only the accurate selection of correct terms and grammatical constructions but also a good command of appropriate lexical combinations and phraseological expressions. This assumption becomes especially apparent in scientific discourse, where a precise expression of ideas and description of results is expected. Several scholars (Gledhill 2000; Flowerdew 2003; Hyland 2008) have pointed to the importance of mastering the prototypical formulaic patterns of scientific discourse so as to produce phraseologically competent scientific texts.
Research on specific-domain phraseology has demonstrated that acquiring the appropriate phraseological knowledge (i.e. mastering the prototypical lexico-grammatical patterns in which multiword units occur) is particularly difficult for non-native speakers, who must gain control of the conventions of native-like discourse (Howarth 1996/1998; Wray 1999; Oakey 2002; Williams 2005; Granger & Meunier 2008).
This paper aims to analyse native speakers’ usage of abstract nouns in medical English, which will contribute to the linguistic characterisation of the discourse of medical science. More precisely, this research study intends to explore native speakers’ prototypical lexico-grammatical patterns around abstract nouns. This analysis is based entirely on corpus evidence, since all collocational patterns discussed have been extracted from the Health Science Corpus (HSC), which consists of a 4-million-word collection of health science (i.e. medicine, biomedicine, biology and biochemistry) texts, specifically compiled for the current research study. The exploration of the collocational behaviour of abstract nouns in medical English will serve as a benchmark against which to measure non-native speakers’ production.
Corpus-based studies have drawn attention to the study of the lexicon as the central principle in language and have also emphasised the interconnections between lexis and syntax (Francis 1993; Hunston & Francis 2000; Wray 2002, among others). Linguistic investigation of naturally-occurring data has revealed that language is organised in terms of a lexico-grammar and, thus, that it consists of recurrent patterns of words (Renouf & Sinclair 1991; Sinclair 1991; Altenberg & Tapper 1998; Stubbs 2001). The study of how words are used to make meanings, in other words how meaning maps onto use, is one of the key concerns in current research in phraseology.
Phraseological empirical studies have confirmed the important role of prefabricated expressions in the textual development of meaning (Gledhill 2000b; Kaszubski 2000) and have also highlighted the need for further research on the phraseological conventions characteristic of specialist genres. As Kaszubski (2000) points out:
“Word combinations are inextricably related to the layer of style – the
appropriateness and/or naturalness of selection and co-occurrence of
items, subject to genre-sensitive restrictions and conventions. Thus, in
order to compare aspects of lexical use, one is bound to focus attention
on phraseology.” (Kaszubski 2000:2)
Specifically, in the field of Language for Specific Purposes (LSP), corpus linguistics has established itself as a fundamental methodological tool for determining the defining linguistic features of different discourse communities (Swales 1990; Lee 2001; Gledhill 2000; Lee & Swales 2006).
This research study intends to explore native speakers’ prototypical lexico-grammatical patterns around abstract nouns, illustrated by a case study on the phraseological behaviour of the noun contribution in medical research articles.
Phraseology in specialised registers
“[T]he picture we have of the research article is far from
complete. That picture suggests that there are certain characteristics of
RAs [research articles] which, by and large, tend to occur and recur in
samples drawn from an extensive range of disciplines (…) However, it
remains the case that RAs [research articles] are rarely simple
narratives of investigations. Instead, they are complexly distanced
reconstructions of research activities, at least part of this reconstructive
process deriving from a need to anticipate and discountenance negative
reactions to the knowledge claims being made.” (Swales 1990:174-175)
Swales’ account of the genre of the scientific article points to the existence of a number of conventions which help to define and characterise scientific discourse. Textual analyses of the genre are hence extremely important for identifying the phraseological structures characteristic of scientific English.
Genre analysis is concerned with a particular subset of language; that is, a specific language practice, characterised by a number of linguistic features and phraseological conventions. It can therefore be claimed that genres make use of different ways of expressing meaning (Hunston 2002:178). This assumption is intimately linked with the concept of local grammar (Gross 1993; Barnbrook & Sinclair 1995; Hunston & Sinclair 2000), which consists of a description of particular areas of language (e.g. the analysis of the phraseology characteristic of medical discourse), rather than the language as a whole (Bednarek 2007).
The current treatment of phraseology in specialised registers acknowledges the need for corpus-based studies of the prototypical lexico-grammatical patternings and discourse functions of lexical phrases across disciplines (cf. Carter 1998; Oakey 2002a/b; Hyland 2008). According to Hyland (2008),
Gaining control of a new language or register requires a sensitivity to
expert users’ preferences for certain sequences of words over others
that might seem equally possible. (Hyland 2008:5)
Thus, it seems that becoming familiar with the specific phraseology of the register of a discourse community will imply not only a better knowledge of the genre but also an enhanced competence in writing and reading in specialised registers. As Williams (2002) claims:
In order to understand texts, we must look at them closely to find the
lexico-grammatical strategies that they adopt to assist communication
within a specialised community. (Williams 2002:60)
Studies in genre analysis, such as Swales’ (1990) investigation of academic and research settings, Bhatia’s (1993) exploration of genre in professional contexts and Gledhill’s (2000b) work on collocations in science writing, suggest that there are significant textual variations in different specific domains and genres. These findings have underlined the value of analysing specialist corpora so as to find out the “kinds of language data which particular communities of users might encounter and which will inform their use” (Hyland 2008:8).
Despite the abovementioned growing interest in the formulaic aspects of language knowledge in specialised registers, Gledhill (2000b) observes that in comparison with linguistic analyses based on general English corpora, less work has been conducted on specialised language to date. More specifically, he claims that there is a noticeable shortage of linguistic corpus-based studies in the field of phraseology in scientific discourse.
There are, however, some remarkable exceptions. Several studies focusing on scientific articles as a whole are worth mentioning: for example, Myers’ (1989) account of the pragmatics of politeness involved in scientific papers; Master’s (1991) study of active verbs with inanimate subjects in scientific English; and Banks’ (1994) analysis of the organisation of different clause types in the scientific journal article.
Other studies on the phraseology characteristic of scientific discourse, by contrast, have centred their investigation either on a specific domain within the field of scientific English or, to a lesser extent, on the different (sub)sections of the scientific article (Adams-Smith 1984; Salager-Meyer 1994; Gledhill 1995/1996; Williams 1996, to name but a few).
All the above literature has proven extremely useful in the characterisation of science writing, and has also confirmed Swales’ (1990) assertion of the complexity of the scientific research article: “the RA [research article] is anything but a simple genre” (Swales 1990:128).
Several textual properties of the scientific discourse such as modality (Huddleston 1971; Widdowson 1979; Adams-Smith 1984; Salager-Meyer 1992; Banks 1994; Gledhill 2000a/b), hedging (Myers 1989; Swales 1990; Salager-Meyer 1994; Banks 1994; Varttala 1999; Gledhill 2000), the use of the passive and the anticipatory it-pattern so as to disguise authorial
interpretations (Huddleston 1971; Swales 1990; Banks 1994; Biber et al. 1998/1999; Hyland 2008), an attested tendency to use grammatical metaphor (Salager-Meyer 1992; Banks 1994; Halliday 1998; Gledhill 2000a/b) and a high use of abstract nouns in the expression of processes and methods (Halliday 1993; Flowerdew 2003) have been identified as defining rhetorical devices which contribute to a great extent to the development of scientific discourse (Luzón 2000; Gledhill 2000b; Noguchi 2006; Hyland 2008).
If, as already discussed in the reported literature, much of the language involved in scientific discourse is “highly stereotypical in nature” (Gledhill 2000a:116), it seems of
paramount importance that members of that discourse community become familiar with the collocational expressions considered to be “good scientific style”, since conforming to those
conventions will provide scientists with the phraseological competence necessary for effective and accurate communication.
Methodology (Corpus data)
The main corpus analysed in this study is the Health Science Corpus (HSC), which is a representative sample of texts specifically assembled for the current investigation of the use of abstract nouns by the health science community. The data corpus was compiled as part of a research project, “Creation of a Database of Lexical Combinations in Scientific English”, financed by the Spanish Ministry of Science and Education and FEDER (BFF2001-2988). Within this project, co-ordinated by Dr. Isabel Verdaguer (University of Barcelona), the GReLiC research team at the University of Barcelona has developed SciE-Lex, a dictionary that provides contextual information on usage as well as the combinatorial potential of words commonly used in scientific registers.
Interested as we were and still are in the lexico-grammatical patterns of non-technical terms in scientific English and the conventionalized phraseological characteristics of that genre, and bearing in mind that there was no corpus of scientific English publicly available, the GReLiC research group decided to compile their own micro-corpus, now consisting of approximately 4 million words of scientific research articles from prestige online journals that cover different disciplines such as medicine, biology, biochemistry and biomedicine.
Computerised corpora and linguistic software tools are essential for linguistic data management. Processing a corpus by computerised methods allows the researcher to handle in real time large quantities of text, which would otherwise have been impossible. There are several software programmes at the lexicographer’s disposal to store a corpus electronically. The software used in retrieving data from the HSC and refining the results further was version 3.0 of the concordancing program WordSmith Tools. This program provided a list of words, which allowed us to find out what general terms are most frequently used in scientific English. Taking such a list as the starting point, the abstract noun contribution was selected for the present research paper.
Among the various tools available, WordSmith was extremely useful not only to identify
collocates and their frequency, but also to classify such collocates in terms of grammatical position, word class, semantic category, etc., to analyse word clusters so as to see the patterns of repeated phraseology in the concordance lines analysed and to make generalisations from the observation of repeated language events.
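The two operations described above, retrieving the collocates of a node word and listing the word clusters in which it occurs, can be illustrated with a minimal Python sketch. This is not WordSmith's implementation: the tokeniser, the four-word span, the cluster size and the function names are all assumptions made for illustration, and the sample text simply reuses two of the example sentences quoted later in this paper.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into (optionally hyphenated) word tokens."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())

def collocates(tokens, node, span=4):
    """Count the words found within `span` tokens to the left or right of `node`.

    The span of 4 is an illustrative assumption, not a setting taken from the study.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts

def clusters(tokens, node, size=3):
    """Count contiguous sequences of `size` tokens that contain `node`."""
    counts = Counter()
    for i in range(len(tokens) - size + 1):
        gram = tuple(tokens[i:i + size])
        if node in gram:
            counts[gram] += 1
    return counts

# Two sentences quoted as examples (2a) and (4a) in this paper.
sample = (
    "This book should make a significant contribution to the reemergence of the field. "
    "It provides only marginal contributions to binding as judged by inhibition studies."
)
tokens = tokenize(sample)
colls = collocates(tokens, "contribution", span=4)
grams = clusters(tokens, "contribution", size=3)

for word, freq in colls.most_common():
    print(word, freq)
```

Note that the sketch matches the exact form contribution only, so the plural contributions in the second sentence is not counted; a fuller treatment would group inflected forms under one lemma.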
As pointed out above, one of the main aims when compiling the HSC was to make a representative selection of naturally-occurring language in a very specific type of genre, health science discourse, so as to analyse the collocations and syntagmatic structures associated with abstract nouns in that particular register. In this respect, it seems worth recalling the notion of “local grammar” (Barnbrook & Sinclair 1995; Hunston & Sinclair 2000; Hunston 2002), which refers to descriptions of particular areas of language (rather than the language as a whole): “the connection between pattern and meaning opens the possibility of quantifying ways of expressing meanings in different registers via the concept of ‘local grammar.’” (Hunston
The compilation of the HSC, thus understood as “an authoritative body of linguistic evidence which can support generalizations and against which hypotheses can be tested” (Sinclair 1987:2), has facilitated the exploration of the phraseological behaviour of abstract nouns in medical English, with respect to patterning by means of corpus evidence. For many research and pedagogical purposes, I do believe that the larger the corpus is, the more reliable the conclusions that can be drawn from the careful examination of the language shown. Nevertheless, it must be stressed that I am fully aware of the fact that the HSC constitutes a sample, a cross-section of the health science discourse, so claims will be based only on the results obtained from the in-depth analysis of the HSC data. As Partington (1998) observes: “a corpus, no matter how large and varied, is only representative of itself and claims made about the behaviour of linguistic items after studying corpus data should bear this in mind” (Partington 1998:146). Following Partington’s consideration, Hunston (2002) notes that all observations made from a particular collection of texts “must be dealt with as deductions rather than as facts” (Hunston 2002:23).
The use of abstract nouns in medical English. Some preliminary considerations.
It must be noted that the decision to analyse abstract nouns was based on the findings from a study conducted on the behaviour of the noun conclusion and its restricted collocations in the scientific register. While focusing on that abstract noun, it became apparent that there was a list of frequent comparable nouns, etymologically related to a verb (conclusion ~ conclude; agreement ~ agree; comparison ~ compare; contribution ~ contribute and decision ~ decide, to name but a few), which needed more thorough investigation.
Abstract nouns in combination with other parts of speech are frequently used in academic register to refer to scientific processes, methods, evidence and findings. A detailed study of the patterns of abstract nouns in the HSC has revealed a vast number of phraseological units whose overall meaning results from the interaction among their various elements. As will be shown later, some of the collocates these frequent abstract nouns co-occur with have undergone a process of ‘delexicalisation’ and ‘semantic bleaching’, which means that these nouns mostly provide the semantic content to the whole unit. In most cases, such abstract nouns collocate with delexicalised or support verbs which contribute very little to the meaning of the whole unit. It is thus the larger unit that is the complete unit of meaning, rather than the individual lexical items.
This phenomenon has already been pointed out in numerous corpus studies which highlight the importance of multiword units in linguistic production (Sinclair 1991; Gledhill 2000; Altenberg & Granger 2001; Oakey 2002; Stubbs 2001; Simpson 2004). In line with Sinclair’s idea that meanings are clustered into lexico-grammatical patternings and not in isolated terms, the following section focuses on the recurrent sequences of words that commonly co-occur with the abstract noun contribution, selected for the present study, in the health science discourse.
In Partington’s words, the process of ‘delexicalisation’ involves “the reduction of the independent lexical content of a word, or group of words, so that it comes to fulfil a particular function but it has no meaning apart from this to contribute to the phrase in which it occurs” (Partington 1993:183). In this respect, expressions such as make a conclusion, reach an agreement, make a comparison, make a contribution and make a decision must be analysed as multiword units, where the abstract noun provides the semantic content to the extended lexical string of words to the detriment of the semantic content of the lexical verb. Such restricted verbs have adopted a more grammatical role and simply perform a verbal function.
As Partington (1993) observes, the notion of ‘delexicalisation’ is closely related to Sinclair’s concept of ‘shared meaning’, “a distribution of meaning across a number of words” (Sinclair 1987b:110), which accounts for the fact that single words and their context of appearance are mutually co-selected: “words in English do not normally constitute independent selections (…) The item and the environment are ultimately not separable” (Sinclair 1992:15).
According to Sinclair (1997:323), ‘semantic depletion’ is common with high frequency
words, which tend to lose their independent meaning and adopt a more grammatical role. The analysis of V + Noun periphrastic structures in the corpus data reinforces this statement. In this view, the boundaries between a lexical item and its environment become fuzzy. Evidence from the HSC has revealed that delexical uses of verbs co-occurring with abstract nouns are usual. The conclusion to be drawn from this fact is that the more delexicalised a unit is, the more widely it collocates (Partington 1993:183). All these lexico-grammatical issues will be illustrated in the pages to follow by means of a case study on the combinatorial patterns in which the noun contribution occurs.
The collocational patterning of the noun contribution in the HSC
This section describes the collocational patterning of the noun contribution in the HSC, which will be followed by a discussion of the overall results, with a view to exploring the particular phraseology of the noun under study in the health science discourse. Unlike other abstract nouns, such as conclusion, agreement and decision, the noun contribution shows fewer entries (249), whereas its corresponding verb, contribute (to) [1071 occurrences], is more frequently used in the HSC. The examination of the morphology of the noun contribution indicates that this is a countable noun used to refer to an abstraction. In the view of Biber et al. (1999), “countability is not a simple reflection of things observed in the external world (...) with reference to discrete concrete objects, but also to abstractions which do not so obviously or naturally come as distinct entities” (1999:242).
Taking into account that an obvious feature of countable nouns is their variation in number, the abstract noun contribution must be regarded as a fully countable entity. From a grammatical perspective, contribution is characterised by number variation (i.e. it inflects for the plural) and its co-occurrence with determiners, mainly central determiners: definite and indefinite articles, demonstrative and possessive determiners. However, it should be mentioned that this noun shows a preference for its base form (166 occurrences) over its inflected counterpart, contributions (83 occurrences). The next subsections will focus on the different patterns this noun collocates with.
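The frequencies reported above can be checked and expressed as proportions with a few lines of Python. The sketch only restates the counts given in the text; the variable names are illustrative and no further corpus data are assumed.

```python
# Frequencies reported in the text for the HSC.
noun_singular = 166   # contribution
noun_plural = 83      # contributions
verb_total = 1071     # contribute (to)

noun_total = noun_singular + noun_plural        # should match the 249 entries reported
singular_share = noun_singular / noun_total     # proportion of base forms
plural_share = noun_plural / noun_total         # proportion of inflected forms
verb_to_noun_ratio = verb_total / noun_total    # how much more frequent the verb is

print(f"noun total: {noun_total}")                    # noun total: 249
print(f"singular share: {singular_share:.1%}")        # singular share: 66.7%
print(f"plural share: {plural_share:.1%}")            # plural share: 33.3%
print(f"verb/noun ratio: {verb_to_noun_ratio:.2f}")   # verb/noun ratio: 4.30
```

In other words, the base form accounts for two thirds of the nominal occurrences, and the verb is roughly four times as frequent as the noun.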
a) verb + contribution (to)
Amongst the various verbs combining with this abstract noun, there is a limited group of verbs that stand out as being semantically equivalent to contribute. This group of restricted collocates consists of the following verbs: make, provide and produce. Consider some examples:
(1a) Bruce Weir has made many important contributions to population genetic inference theory.
(1b) Bruce Weir contributed to population genetic inference theory.
(2a) This book should make a significant contribution to the reemergence of the field.
(2b) This book should contribute significantly to the reemergence of the field.
(3a) Embryos were organized like cellular jigsaw puzzles, each cell of which was prespecified to
produce its own precisely delimited contribution to the mosaic that was the developing organism.
(3b) Embryos were organized like cellular jigsaw puzzles, each cell of which was prespecified to
contribute to the mosaic that was the developing organism.
(4a) It provides only marginal contributions to binding as judged by inhibition studies.
(4b) It contributes marginally to binding as judged by inhibition studies.
(5a) These multivitamins did not appear to provide significant contributions to the parameters stated.
(5b) These multivitamins did not appear to contribute significantly to the parameters stated.
As can be seen in the examples above, the periphrastic structures of the type V + contribution are equivalent to the verb contribute, given that they convey the same meaning. In this respect, it is particularly relevant that make, provide and produce have undergone a process of ‘delexicalisation’, by which they have gradually lost their primary sense of “making, creating something” in favour of the meaning provided by the abstract noun they collocate with.
From a semantic point of view, two key questions arise which this analysis attempts to answer: what do the periphrastic structures make a contribution to, provide a contribution to and produce a contribution to mean? Are they as polysemic as their equivalent, contribute to? The answers to these two questions are of central importance, as one of the main principles underlying ‘pattern grammar’ is that the meaning of one of the items in a collocation is tied to its co-occurrence with the other item. Thus, for instance, the meaning of make in make a contribution to is semantically constrained by the collocation it occurs in. Likewise, the full verb contribute to is highly polysemic in general English. WordNet identifies four different senses of this verb: 1) bestow a quality on; 2) provide money, time, knowledge, assistance, etc. along with others to a common supply, fund, etc.; 3) be conducive to; and 4) contribute to some cause, for instance, furnish works for publication.
By contrast, the occurrences of both contribute to and “V + contribution to” found in the HSC show a narrower field of use. All the examples refer to a more figurative sense of contribute ~ contribution rather than being associated with money, time and the like. A possible paraphrase of these units in the HSC could be “play a significant part / help cause something”. With this sense, the restricted collocates of the type V + contribution to and the cognate verb contribute to are semantically interchangeable. This fact is consistent with the phenomenon of ‘delexicalisation’ observed in the verbs make, provide and produce, since the meaning of “playing a significant part” is mainly conveyed by the abstract noun contribution in these periphrastic structures and by the verb contribute to in synthetic uses.
Moving now on to syntax, there are a few points that should be considered as well. It has long been acknowledged that in scientific writing, writers often make use of the passive voice in order to avoid the repetition of personal references (i.e., I / my / me; we / our / us) and to make the text look more impersonal, neutral and objective (Swales 1990; Biber et al. 1998/1999; Hyland 2008). This syntactic feature (form) contributes to a great extent to placing the emphasis of a given message on processes and experimental procedures (meaning), which is a widely used device in academic writing.
The use of make a contribution, however, differs from what has been stated above. Of a total of 25 occurrences in the corpus, only one instance of this sequence has been found in the passive:
(6) First, mediastinal tissue analysis of AIDS patients from autopsy revealed that five of seven (71%)
patients had either no thymus or no areas of thymopoiesis, demonstrating that no contribution to
the peripheral T-cell pool was being made by the thymus (...)
Such a preference for active constructions seems to be semantically motivated. The sense of this pattern, which could be paraphrased as “X plays an important part in Y”, requires an explicit specification of the ‘agent’ (i.e., the maker / causer of the action described); the Subject can by no means be demoted because it represents an important focus of attention. Below are some examples that illustrate the gist of this argument:
(7) This amplification makes too small a contribution to the total amount of lac DNA to be detected
(8) The challenge for the future will be to determine not simply that such altered cell biology could have
an effect but that their effects are large enough to make a significant contribution to age-related
(9) Additional transcripts of abundance class D make the largest and decisive contribution to the
Last but not least, notice that in the only example of make a contribution in the passive found in the corpus (see example 6 above and Figure 1 below), the agent ‘by’-PP has not been omitted (“[...] by the thymus”). Again, this underlines the fact that this typically optional element in passive sentences is considered to be relevant in this case.
Figure 1 Active and passive constructions with the pattern restricted verb + contribution
In sharp contrast, there is a wide group of verbs that combine with contribution but are not semantically equivalent to the verb contribute. Most of these free collocates only appear once in the HSC (see Figure 2 and Table 1 below for frequency rates), but they are used to convey a very wide range of meanings.
[Figure 2 chart labels: VERBS A: 5; VERBS B: 4; VERBS C: 3]
Figure 2 Active and passive constructions with the pattern free verb + contribution
VERBS A (1 occurrence)   VERBS B (2 occurrences)   VERBS C (3 occurrences)
CONSIDER   SUGGEST   EXAMINE
ELIMINATE   INDICATE   EVALUATE
SHED LIGHT ON   MODEL   INVESTIGATE
SUSTAIN   ESTIMATE
REFLECT   DISSECT (OUT)
Table 1 Frequency rates of the pattern free V + contribution
Especially noticeable is the use of active structures (only the verbs exclude and assess are used in the passive voice, once each) as well as a common preference for verbs connected with scientific procedures: assess, demonstrate, determine, examine, evaluate, investigate, etc.
For the present study, these free collocates have been grouped into different semantic fields,
consisting of comparable verbs that convey similar meanings and, consequently, can be encompassed under the same category. From the very beginning, all these semantic fields seem to follow a logical line of thought that can be summed up as follows:
1) X makes a contribution to Y.
2) Z examines that contribution of / to something.
3) Z evaluates that contribution of / to something.
4) Z excludes that contribution of / to something.
5) Z appraises that contribution of / to something.
Such a sequence of actions can be expressed by a variety of verbs and described as part of a logical sequence of events, as illustrated in Figure 3:
PROVIDE A CONTRIBUTION
EXAMINE (? OVERLOOK) DISCUSS
MODEL SHED LIGHT ON
DISSECT OUT DISCERN
DEFINE ELUCIDATE A CONTRIBUTION
QUANTIFY A CONTRIBUTION
LIMIT APPRAISE A CONTRIBUTION OR MINIMIZE A CONTRIBUTION ACKNOWLEDGE EXCLUDE ELIMINATE
Figure 3 Restricted and free collocates of the pattern V+ contribution grouped into semantic fields and
described as part of a logical sequence of events
b) adjective + contribution
There is a wide range of attributive adjectives occurring to the left of the abstract noun contribution. Although both descriptors and classifiers can be found as modifiers of contribution, descriptors are outnumbered by relational and topical classifiers (see Table 2 for frequency counts of modifiers of this noun). As for the former, there are adjectives covering the semantic domains of size, quantity and extent (small, minor, minimal, heavy, lower, massive), time (new, early) and evaluation (significant, important, outstanding, biased, decisive, favorable, functional, sympathetic). The following Table shows the most frequent adjectives in combination with the noun contribution in the HSC:
ADJECTIVES + contribution
EVALUATIVE SIZE EXTENT TIME RELATIONAL TOPICAL
SIGNIFICANT (10) MAJOR (4) MINIMAL NEW RELATIVE (23) CALORIC (2)
IMPORTANT (7) MINOR (3) HEAVY EARLY FUNCTIONAL (4) TECHNICAL
SUBSTANTIAL (3) SMALL (2) LOWER MATERNAL (4) BIOLOGICAL
OUTSTANDING MASSIVE PARTICULAR (3) PHYSIOLOGIC
BIASED FRACTIONAL (2) PLACODAL
FAVORABLE INDIVIDUAL (2) CELLULAR
SYMPATHETIC SPECIFIC (2) NUTRIENT
UNEQUAL (2) GENETIC
Table 2 Frequency rates of descriptors and classifiers + contribution
Although descriptive adjectives do not usually collocate with the noun contribution, it should be highlighted that the list of descriptors seems to be limited by the constraints of the genre in question. Scientific writing appears to rely more on relational/classifying (relative, functional, maternal), affiliative (African) and topical adjectives (technical, physiologic, caloric, cellular, nutrient, genetic).
This phenomenon is what could be referred to as “stylistic preference”; the most common
adjectives in this type of genre are classifiers because academic writing is concerned with delimiting, defining, classifying and focusing on demonstrable data rather than on making
judgements or personal evaluations, which tend to be more typical of fiction and literary
It is well known that many adverbs are formed from adjectives by derivational affixation, namely by suffixing -ly to the base form of the adjective. This regular process plays an important part when analysing the most frequent adjectives in combination with contribution: significant, substantial, important. Examining the environment of the verb contribute in the HSC shows that it usually collocates with adverbs derived from the most common adjectives combining with contribution: significantly, substantially, importantly.
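The correspondence between the two sets of collocates can be made explicit with a short Python sketch. The derivation rule used here is a deliberate simplification, assumed for illustration only: it appends -ly and ignores the spelling adjustments that other adjectives require (e.g. happy ~ happily), which do not affect the three adjectives at issue.

```python
def to_adverb(adjective):
    """Naive -ly suffixation; spelling changes in other adjectives are not handled."""
    return adjective + "ly"

# Most frequent adjectives combining with the noun contribution, as listed in the text.
frequent_adjectives = ["significant", "substantial", "important"]

# Pair each adjective with the adverb expected to collocate with the verb contribute.
pairs = {adj: to_adverb(adj) for adj in frequent_adjectives}

for adj, adv in pairs.items():
    print(f"{adj} contribution ~ contribute {adv}")
```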