
What’s that supposed to mean: chunking – part three


The two previous posts in this series (see here and here) looked at the relevance of “chunking” in language production. In the last blog, I discussed collocation, and showed how integral it is to the use of a word like “crime” – to the extent that it is almost impossible to use the word without knowing its most common collocates. Corpus data shows that many collocations are frequent lexical items in their own right, and this holds true for other types of chunk. Consider, for example, so-called “lexical bundles” like “on the other hand”. This has a frequency of about 25 occurrences per million words of text – which makes it slightly more common than, say, “persuade” (24.5 per million). In other words, multiword expressions are often just as frequent as common single words. “In other words”, by the way, has a frequency of over 18 per million; “by the way”, as a matter of fact, appears about 11 times per million words; “as a matter of fact” … well, we could continue like this indefinitely. The point is that these are all frequent items, and there is a general consensus (following the work of vocabulary experts like Paul Nation) that the most frequent items in a language are, as a rule, the most useful, and should therefore be taught first.
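For readers who want to check figures like these against a corpus of their own, here is a minimal sketch of how a "per million words" frequency can be worked out. It is not the method used to produce the numbers above: the function name `per_million`, the simple tokenizer and the toy text are all assumptions made for illustration, and a real corpus tool would handle punctuation, hyphens and curly apostrophes far more carefully.

```python
import re


def per_million(phrase: str, text: str) -> float:
    """Return occurrences of `phrase` per million word tokens in `text`.

    Hypothetical helper for illustration only: tokens are runs of
    lowercase letters and straight apostrophes, nothing more refined.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    target = phrase.lower().split()
    n = len(target)
    if not tokens or n == 0:
        return 0.0
    # Slide a window of n tokens across the text and count exact matches.
    hits = sum(
        1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target
    )
    # Normalize the raw count to a rate per million tokens.
    return hits * 1_000_000 / len(tokens)


# Toy text of 10 word tokens, so a single hit scales to 100,000 per million.
toy = "On the other hand, we tried to persuade him again."
print(per_million("on the other hand", toy))
print(per_million("persuade", toy))
```

Normalizing in this way is what makes it possible to compare a four-word bundle with a single word: both are counted as one item and divided by the same total number of words.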

But “chunking” is relevant not only to how we produce language. The way words typically combine determines how we interpret what we read and hear, too. When a word has more than one meaning, we use context to work out which one is intended. Think of a word like argue: on its own it means nothing – its different meanings are only activated when it appears with other words in sentences like these:

He was sent off after arguing furiously with the referee.
It could be argued that current demand is stimulated by low interest rates.
Claire and I argued over who should pay the bill.
As Sinclair convincingly argues, chunks are central to how language works.

Argue has two main meanings: to quarrel, and to make the case that something is true. The intended meaning is signalled by clues in the context: in each of these sentences, there is only one possible interpretation, because some clues belong to one meaning, and some to the other. There is a complex mix here of lexical bundles (it could be argued, as X argues), collocation (argue convincingly/furiously), and syntax (argue that…, argue with/about etc) – so it doesn’t really make sense to ask which is “grammar” and which is “vocabulary”. And as Macmillan’s new collocations dictionary makes clear, collocates have a vital role in signalling which of several possible meanings is intended.

What corpus exploration has shown is that “chunks” are more than just an optional extra that you employ to add a sense of naturalness: as John Sinclair points out*, this aspect of language is not “a minor feature, compared with grammar” but “at least as important as grammar in the explanation of how meaning arises in text”. Fluent processing of language – whether you are in receptive or productive mode – requires a knowledge of frequent word combinations or chunks.

* Sinclair, J: Corpus, Concordance, Collocation, p112


About the author

Michael Rundell


  • I would like to personally thank you for your non-stop work. My students (about 100 people) bought the Macmillan dictionary. All of them keep thanking me because they like it very much. I felt that the thanks are more for you because I only recommended it. So, a lot of thanks from my students!

  • As an English teacher I usually point out the common ways native English speakers express word combinations, even though I know what a student means when they use ways that are not common. I find myself thinking “oh, s/he means blah blah blah.” I also find it slows down my ability to listen carefully to the content if a language learner is unknowingly making up their own expressions all the time. I would also imagine that when language learners get out into the real world using their “new language” they might get laughed at for speaking “funny English.” What teacher would tell an adult student their English is good if they said “There was a big wind on the day the man did a bad crime”? I think that sentence sounds suitable for an 8-year-old, not a 28-year-old. Good article!

  • I apologize for coming to this rather late, almost 3 years after the post, but the information presented here distorts the facts. Yes, the two examples Michael picks happen to be relatively frequent. But they are not representative, and by no means could we “continue like this indefinitely.” In fact, Shin & Nation estimate that only about 84 chunks are as frequent as the most common 1,000 word families. And that’s 84 out of roughly 4,698 matching their criteria to the top 1,000 pivot words (see the paper for an explanation).

    310 chunks would merit being taught along with the 2,000 high-frequency word families, and the number rises to 570 if you put the cutoff at 3,000 words.
