
You say “lovely”, I say “great”


Stan Carey’s post yesterday was a nice reminder of how a word or phrase can suddenly gain widespread currency simply as a result of fashion. And as with any trend, the kudos gained by the user falls as the number of users rises – so that in the end the phrase becomes an overused cliché which discerning speakers avoid because they don’t want to sound like everyone else. So ‘fit for purpose’ may already have passed its peak. As Stan observed, it is too recent to be found in older corpora. The British National Corpus (collected in the early 1990s) has only two examples of the phrase, where it is used in its original legal sense, referring to items sold to the public: these must – under the UK’s Sale of Goods Act – be ‘fit for purpose’. But the big corpus we use now at Macmillan Dictionaries has 1726 hits for this expression, and most refer to organizations, systems, and even people.

This raises questions about the reliability of our language data – the raw materials from which we make the dictionary – in cases where language is changing fast. For the last 25 years or so, linguists and lexicographers have been using corpus data to learn about the behaviour of words and the way the language system works. But an exciting new research area is opening up, based not on conventional corpora, but on Twitter feeds.

These have the advantage of providing high volumes of very up-to-date examples of language in use. Some of this data is used for what is known as ‘sentiment analysis’, as a way of discovering people’s attitudes on various topics, and tweets have even been used (with some success) to predict the movement of the stock market. But this material can be used for linguistic research too. One interesting project is looking at differences in language use between men and women. By analysing millions of Twitter messages where the writer can be reliably categorised as male or female, the researchers are able to compare the way particular words or phrases are used. They have produced a website where you can key in words and compare their frequency according to the gender of the writer. Try looking, for example, at these four words of enthusiastic approval, to see which are used more by women and which are favoured by men: lovely, great, brilliant, fabulous.
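For readers who want to try this kind of comparison on their own data, here is a minimal sketch in Python. It is not the researchers' method or the website's code: the function name per_million, the crude tokenizer, and the handful of sample messages are all invented for illustration. It simply counts how often each word occurs per million tokens in two labelled sets of messages, so that groups of different sizes can be compared fairly.

```python
from collections import Counter
import re


def per_million(messages, words):
    """Return occurrences per million tokens for each word in `words`.

    Hypothetical helper: a real study would need a better tokenizer.
    """
    tokens = [t for m in messages for t in re.findall(r"[a-z']+", m.lower())]
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (counts[w] * 1_000_000 / total if total else 0.0) for w in words}


# Invented sample data, standing in for millions of labelled tweets.
sample = {
    "female": ["What a lovely day", "That dress is fabulous", "Lovely to see you"],
    "male": ["Great game last night", "Brilliant goal", "Great result"],
}
words = ["lovely", "great", "brilliant", "fabulous"]

# Normalised frequency of each word, per group of writers.
rates = {gender: per_million(msgs, words) for gender, msgs in sample.items()}
for w in words:
    print(w, {g: round(rates[g][w]) for g in rates})
```

Normalising to a rate per million tokens, rather than comparing raw counts, matters because one group may simply have written more text than the other.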

The ‘detailed query’ function also lets you see how words typically combine. The adjective gorgeous, for example, is used almost three times as often by women as by men – but when men do use it, the nouns it most often modifies are woman and girl, whereas women tend to use gorgeous to describe things like dresses, pictures, and views. I’m not sure what to make of that, but I expect someone will have a theory!
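A rough approximation of this kind of combination query can be sketched in Python as well. Again, this is an assumption about how one might do it, not a description of the site's ‘detailed query’ function: the helper following_words and the sample messages are made up, and the sketch just counts whichever word comes directly after the adjective, without checking that it is a noun.

```python
from collections import Counter, defaultdict
import re


def following_words(labelled_messages, target):
    """Count the word that directly follows `target`, grouped by writer label.

    Hypothetical helper: a real tool would use part-of-speech tagging
    to restrict the results to nouns.
    """
    result = defaultdict(Counter)
    for label, text in labelled_messages:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Walk over adjacent pairs of tokens.
        for current, nxt in zip(tokens, tokens[1:]):
            if current == target:
                result[label][nxt] += 1
    return result


# Invented sample data for illustration only.
sample_messages = [
    ("female", "What a gorgeous dress"),
    ("female", "Gorgeous views from the hotel"),
    ("female", "Such a gorgeous dress"),
    ("male", "She is a gorgeous girl"),
]

collocates = following_words(sample_messages, "gorgeous")
for label, counter in collocates.items():
    print(label, counter.most_common(2))
```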


About the author

Michael Rundell
