The familiar question of “how words get into the dictionary” is harder to answer when the dictionary is online. Printed dictionaries have limited space, so we have to be selective. This contributes to the popular view of lexicographers as “gatekeepers” – the people who decide, on behalf of the rest of the population, which words are “good” enough to get into the dictionary. To make these selections reliably, good dictionary publishers establish clear “inclusion criteria”, based on what we know about the needs and interests of our users. (Macmillan’s inclusion policy is always evolving. Our current position is explained here.)
But an online dictionary has enough room to include whatever it wants, so what happens to these criteria? Do we simply open the doors and let in anything and everything? The short answer is no, but it’s an interesting challenge to formulate new guidelines for the new situation we find ourselves in. A good way to start is to look at three possible candidates for inclusion which I’ve come across in the last week. These are:
(1) mousemageddon
(2) serious untoward incident
(3) nomakeupselfie
The first comes from a newspaper article about mice infesting people’s homes:
Over the years, in my many experiences of mousemageddon, I’ve identified a common pattern of human responses to this invasion. (Rosamund Urwin, London Evening Standard 10 July 2014)
I discovered the second while looking at data for the adjective untoward. It quickly became clear that this is an established term in the field of hospital management, with 30 or so examples in the corpus, including:
The death of Patient P was treated as a serious untoward incident.
Pulse found there had been nine ‘serious untoward incidents’ in four of the seven pilot sites.
Investigating a serious untoward incident and preparing a report can be a difficult and time-consuming process.
Finally, nomakeupselfie came fourth in a poll where users voted for the word they most wanted to see in the Collins dictionary.
All three raise interesting questions.
In the case of mousemageddon the writer is discussing her war on mice, and uses a blend (mouse+armageddon) to encapsulate the idea in an entertaining way. People make up words all the time – it’s one of the brilliant things about language. Some get picked up by other writers or speakers, perhaps because they express a useful concept in a neat new way, and if the word starts being widely used, it deserves a place in the dictionary. But most fall by the wayside and are quickly forgotten – and this will probably be the fate of mousemageddon.
Serious untoward incident, on the other hand, is shown by corpus data to be a genuine term. So should it go in the Macmillan Dictionary? This is where we have to think about the dictionary’s function. The Macmillan Dictionary is a general-purpose dictionary used primarily by people in the English-as-a-second-language field (teachers, students, professional users), and it aims to provide an up-to-date record of English as it is used around the world. As lexicographers, we need to ask – when confronted by something like serious untoward incident – “is this the kind of word we’d expect our users to need to know about?” In this case, probably not. We certainly do include technical vocabulary from a range of special fields, such as economics, linguistics, biology, and indeed medicine. But this is a term used solely between experts in one quite specialised area of medicine. There is no evidence for it being used outside such contexts, no evidence for it “going mainstream”, so it stays out.
Which leaves nomakeupselfie. There are two problems with this. First, it is a “transparent” compound: its meaning is easy to work out as long as you know what a selfie is. People often put words together like this – a newspaper I read yesterday described an easy task as a simple “do-it-in-your-lunchbreak-while-watching-Neighbours mission”, but no-one would expect to find this 8-word compound in a dictionary. The second problem is that nomakeupselfie isn’t a word people use in normal discourse. It is a Twitter hashtag, and there is no obvious reason to add the many thousands of these to a dictionary. And while we are on the subject of selfies, the word “felfie” (a selfie of a farmer, apparently) came second in the Collins poll mentioned above. Commenting on this in the Economist, Robert Lane Greene makes the point that “Lexicographers should be logging the words people actually do use, not the ones they say they like. … it is easy to imagine people voting for a cute coinage they would never actually utter or write”.
He is right. I hate to spoil the fun of all these logophiles voting for silly “words” like felfie, but a word doesn’t become valid just because you like the sound of it. Sure, in the online environment we’re not as strict about inclusion as we had to be when working on paper, and that’s a good thing. But evidence of usage remains key to any decision about what goes in the dictionary. This is why the guidelines for our Open Dictionary include this suggestion:
We are looking for “real” words: that is, words which you know, and which you have seen or heard in use. We do not include words which have simply been invented.
I’m happy to report that the many people who contribute to the Open Dictionary take this seriously, and it’s rare for anyone to submit a word which they have just made up with no communicative purpose in mind. Everything comes back to evidence of usage, and if a dictionary isn’t evidence-based, it is unlikely to be much good. Over 150 years ago, Richard Chenevix Trench – in a famous lecture to the Philological Society in London – outlined the principles of good lexicography, and much of what he said holds good today. A key passage of his lecture includes this:
A Dictionary is an inventory of the language…It is no task of the maker of it to select the good words of a language…The business he has undertaken is to collect and arrange all the words, whether good or bad …which those writing in the language have employed.
There are two valuable principles here: don’t exclude words simply because you dislike them, and equally, don’t include words if there is no evidence that “those writing [or speaking] in the language” have ever used them.