John Sinclair (ed.), Collins Cobuild English Collocations on CD-ROM, London: HarperCollins 1995.

 

Commonly occurring sequences of words, technically known as collocations, are a major source of difficulty for learners of English. There has been an astonishing dearth of publications in this area of lexicographic research, with the result that German speakers of English have so far had to rely on just two combinatory dictionaries, the Dictionary of English Words in Context (Friederich 1979) and The BBI Combinatory Dictionary of English (Benson 1986). The year 1995 brought us another interesting specimen in the form of Collins COBUILD English Collocations on CD-ROM, which differs from its predecessors in two major ways:

 

- Firstly, it is only available in electronic form. This, however, should by no means deter future users. The software, using the well-known Windows environment, has clearly been designed for ease of use. All the usual Windows functions can be activated at the click of a button. Search results can be saved and printed.

 

- Secondly, it has been compiled by computer, with the lexicographers making only general decisions about the number of entries, collocations and examples to include.

 

The dictionary contains 10,000 entries, or 'node words', and 140,000 collocations, so that for each entry, there are on average 14 node/collocate pairs. Each of these word pairs is illustrated by a set of up to 20 citations culled from the Birmingham-based Bank of English, a vast collection of written and spoken texts running into hundreds of millions of words. The examples appear as sorted concordance lines and can be expanded to more than sentence length. This is a truly innovative feature of the work under review and a definite improvement on earlier generations of combinatory dictionaries, marred slightly, however, by the fact that the source of each citation is identified only in very general terms, e.g. 'Source: Books (British)' or 'Source: Journalism (American)'.

 

Sadly, the dictionary has little else to commend it. It is flawed in a number of ways and, on the whole, compares unfavourably with its predecessors. To begin with, the computer analysis is based on what is clearly too broad a notion of collocation[1] to be of any practical use to learners. It includes compounds (disaster relief, schools inspector), which can be found more readily in a general bilingual or monolingual dictionary, and, most inappropriately, free combinations[2] (thus we find new as a collocate of gallery, or such + disaster). Worse still, it is not exactly rare for the unsuspecting user to stumble on such manifest absurdities as nature + because, religious + between, or advances + heavy, all of which reveal a regrettable lack of human intervention in the compiling process. This policy has two further consequences: different senses of the same headword are not distinguished (cf. the node distinction and its collocate list, reproduced below), and two or more inflected forms of the same word appear as distinct collocates, as in the case of farewell + bade and farewell + bidding[3]. Far (but not well) is correctly given as a collocate of behind, but also as one of lag, purely because lag co-occurs significantly with behind. According to the Cobuild editorial team, holiday collocates with off, but not with on. If one takes a closer look at the text samples provided for this pair, however, one will find such sentences as 'We're going off on holiday' ...
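The two effects just described (far listed under lag, and bade and bidding listed separately under farewell) are what one would expect from a purely form-based count of neighbouring words. The following Python sketch illustrates the general idea only: the review has no information on the software actually used by the Cobuild team, so the function window_collocates, the window size and the four toy sentences are all assumptions invented for the purpose of illustration.

```python
from collections import Counter

def window_collocates(sentences, node, span=4):
    """Count every word form found within `span` tokens of `node`.

    Hypothetical helper: no lemmatisation, no syntactic analysis,
    no stop-word filtering, i.e. no 'human intervention'.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token != node:
                continue
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts

# Invented toy sentences, NOT citations from the Bank of English.
toy_corpus = [
    "they lag far behind their rivals",
    "wages lag far behind prices",
    "she bade him farewell",
    "he was bidding them farewell",
]

lag = window_collocates(toy_corpus, "lag")
farewell = window_collocates(toy_corpus, "farewell")

# 'far' is counted for 'lag' only because it modifies 'behind'.
print(lag["far"], lag["behind"])
# Two inflected forms of 'bid' surface as two separate collocates.
print(farewell["bade"], farewell["bidding"])
```

A count of this kind cannot tell that far belongs with behind rather than with lag, nor that bade and bidding are forms of one verb; both decisions require either linguistic preprocessing or an editor.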

 

The indiscriminate computer-driven compilation policy just illustrated means that there is little space left for those highly frequent word patterns that are of vital interest to non-native speakers of the language. One searches in vain for such common usages as - to take up some of the node words mentioned earlier - an unmitigated disaster, a strict / clear-cut distinction, to lag badly behind, to sit in traffic / in a traffic jam or fashionably late, to say nothing of less frequent combinations such as raw nature (a perfectly acceptable collocation a colleague of mine had found in one of her students' exam papers and felt unsure about). In a sense, of course, the corpus of citations is a redeeming feature here in that it allows the user to retrieve at least a few more collocates by looking through the example sentences. Thus, five minutes' research on the node 'enthusiasm' will yield bring enthusiasm to, show enthusiasm, stimulate enthusiasm, receive enthusiasm, have enthusiasm, etc., but the fact remains that common collocations recorded in other dictionaries, such as arouse enthusiasm (Benson 1986, p. 87) cannot be located. Plainly, then, the dictionary under review, although claiming to be a 'comprehensive database', leaves much to be desired from the teacher's and the translator's point of view. Both would need a complete, instant-access picture of a word's collocability range. The Cobuild lexicographers would do well to use more sophisticated search software[4], to bring their human expertise to bear on the data thrown up by the computer and to devote more space to the collocational properties of nodes and somewhat less to citations. In its present form, the work under review only meets the needs of lower-level learners of English, but is probably less useful to them than a general monolingual learner's dictionary.

 

To end this review, here is an illustration of the differences in depth of coverage between a) a 'comprehensive' combinatory dictionary and b) the CD-ROM under review. The noun distinction will serve as an example:

 

a) distinction n. ('differentiation')

 

1. to draw, make, establish a d. / little d. / no d.(s) 2. to blur, mitigate, remove, eliminate, reduce, erode a d. 3. to adopt, use, recognise, respect, enforce, redress a d. 4. a d. becomes / grows blurred, blurs, becomes unnecessary / otiose / ..., disappears 5. a(n) sharp; strict; hard; neat; coherent; marked; clear; obvious; clear-cut; careful; broad; dubious; fine; subtle; minute; precarious; fancy d. (+ free combinations along the lines of: a(n) linguistic, lexical, grammatical, moral, legal, categorical, generic, dogmatic, logical, illogical, crucial, important, etc.) 6. a d. between 7. a d. from (they are treated without the slightest d. from other children) 8. without distinction (the whole human race without distinction between individuals) 9. a d. without a difference (= an unnecessary distinction)

 

['eminence', 'superiority', 'honour', 'special mark']

 

1. to enjoy, have, hold, claim a d. / the d. of being, having been (also in a negative sense) 2. to get, win, earn, achieve a d. 3. to award a d. 4. a(n) doubtful, dubious, spurious, unique, unusual, unsought; social, academic distinction 5. great, rare, considerable, honourable, real, genuine, clear d. (an artist of [...] distinction; pieces of furniture of distinction; to serve, play, perform with / without [...] distinction; the considerable distinction of panel members; there was real distinction about his performance) 7. a d. for (bravery) 8. quality and d.

 

b) node: distinction - collocate list: between, make, made, no, clear, important, makes, draw, great, making, without, drawn, served, dubious, having, sharp, class, crucial, rare, real
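The redundancy in list b) can be quantified with a short Python sketch. The collocates are copied from the list above; the lemma table LEMMAS and the function merge_inflections are hypothetical and hand-written for this illustration, whereas a real system would presumably rely on a proper lemmatiser.

```python
# Collocate list for the node 'distinction', as quoted in b) above.
collocates = [
    "between", "make", "made", "no", "clear", "important", "makes",
    "draw", "great", "making", "without", "drawn", "served", "dubious",
    "having", "sharp", "class", "crucial", "rare", "real",
]

# Hand-written lemma table (an assumption made for this sketch).
LEMMAS = {"made": "make", "makes": "make", "making": "make", "drawn": "draw"}

def merge_inflections(words, lemmas):
    """Replace each word form by its lemma and drop repeats, keeping order."""
    merged = []
    for word in words:
        lemma = lemmas.get(word, word)
        if lemma not in merged:
            merged.append(lemma)
    return merged

merged = merge_inflections(collocates, LEMMAS)
print(len(collocates), "->", len(merged))  # prints: 20 -> 16
```

Four of the twenty slots are thus taken up by inflected variants of make and draw, space that could have gone to collocates such as strict or clear-cut.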

 

Dirk Siepmann


 



[1] For a concise summary of various definitions of 'collocation', see Roberts, R.P., 'Phraseology: The State of the Art', Terminology Update 2 (1993): 4-8.

[2] F. J. Hausmann ('Wortschatzlernen ist Kollokationslernen. Zum Lehren und Lernen französischer Wortverbindungen', Praxis des neusprachlichen Unterrichts 4 [1984]: 395-406) calls these 'Ko-Kreationen' and rightly distinguishes them from collocations. 'Ko-Kreationen', he argues, pose no major problems for the learner.

[3] While this might still be defensible with regard to some entries on the grounds that a collocation occurs more frequently in one tense than in another (e.g. disaster struck), it is clearly unacceptable in such cases as distinction + make / making / made.

[4] See, for example, Smadja, F., McKeown, K. and Hatzivassiloglou, V., 'Translating Collocations for Bilingual Lexicons: A Statistical Approach.' Computational Linguistics 1 (1996): 1-38.