<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Macmillan &#187; language technology</title>
	<atom:link href="http://www.macmillandictionaryblog.com/category/love-english/language-technology/feed" rel="self" type="application/rss+xml" />
	<link>http://www.macmillandictionaryblog.com</link>
	<description>Global English and language change</description>
	<lastBuildDate>Fri, 10 Feb 2012 21:18:14 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=</generator>
		<item>
		<title>How do words get into the dictionary? Part 2: changing times</title>
		<link>http://www.macmillandictionaryblog.com/how-do-words-get-into-the-dictionary-part-2-changing-times</link>
		<comments>http://www.macmillandictionaryblog.com/how-do-words-get-into-the-dictionary-part-2-changing-times#comments</comments>
		<pubDate>Wed, 08 Feb 2012 10:00:47 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[language and words in the news]]></category>
		<category><![CDATA[language change and slang]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[Love English]]></category>
		<category><![CDATA[Open Dictionary]]></category>
		<category><![CDATA[dictionaries]]></category>
		<category><![CDATA[new words]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=22479</guid>
		<description><![CDATA[<br/>In the previous post on this topic, we looked at the criteria traditionally applied by dictionary-makers when considering new words for inclusion. The question is as old as lexicography itself. When he wrote his Plan of an English Dictionary in 1747, Dr Johnson noted that it is ‘not easy to determine by what rule of [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2009/03/drudge.gif"><img class="alignleft size-thumbnail wp-image-1700" title="drudge" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2009/03/drudge-150x150.gif" alt="" width="150" height="150" /></a>In the previous <a href="http://www.macmillandictionaryblog.com/how-words-get-into-the-dictionary-part-1-the-past">post</a> on this topic, we looked at the criteria traditionally applied by dictionary-makers when considering new words for inclusion. The question is as old as lexicography itself. When he wrote his <em>Plan of an English Dictionary</em> in 1747, Dr Johnson noted that it is ‘not easy to determine by what rule of distinction the words of this dictionary were to be chosen’. And having aired his ideas on the subject, he acknowledged that it isn’t always possible to make clear rules and then adhere to them strictly.</p>
<p>The Oxford dictionary website also has a go at explaining its inclusion principles – this time by means of an elaborate <a href="http://oxforddictionaries.com/page/newwordinfographic/how-a-new-word-enters-an-oxford-dictionary" target="_blank">flowchart</a> which takes you through the various decision points. Having cleared numerous <a href="http://www.macmillandictionary.com/dictionary/british/hurdle">hurdles</a>, the successful word is at last included in the dictionary ‘in due course’. I’m not sure I agree with every stage of this. For example, if the question ‘Is its use limited strictly to one group of users?’ is answered with a ‘Yes’, the word is consigned to a sort of <a href="http://www.macmillandictionary.com/dictionary/british/purgatory">purgatory</a> where its behaviour is monitored for possible future inclusion. But dictionaries routinely include vocabulary typical of specific user-groups – the important thing is to apply an appropriate label to indicate that it is not part of the general language. On the whole, though, the Oxford chart gives a good outline of the key criteria: does the evidence come from a range of sources (what we referred to previously as ‘dispersion’), and does it have ‘a decent history of use’ (the longevity argument)?</p>
<p>The problem is that the approach applied by both Oxford and <a href="http://www.merriam-webster.com/video/0032-howaword.htm?&amp;t=1326227263" target="_blank">Merriam-Webster</a> is rooted in the past. It reflects the realities of print-based dictionary publishing – and those days are gone.</p>
<p>What has changed? First, what we’d call the ‘publishing cycle’. When dictionaries existed mostly as printed books, publishers would produce a new edition every four or five years. They collected new vocabulary as it appeared, but they could <a href="http://www.macmillandictionary.com/dictionary/british/long#take-the-long-view-of-something">take the long view</a> on whether something was worth including. We do things differently now. Consider, for example, the linguistic <a href="http://www.macmillandictionary.com/dictionary/british/fallout">fallout</a> of the global financial crisis that began in 2008 – just a year after Macmillan published the second edition of its dictionary. With the dictionary now mainly consulted online, we were able to add important new usages, such as the term <a href="http://www.macmillandictionary.com/dictionary/british/credit-crunch"><em>credit crunch</em></a> or the new sense of <em><a href="http://www.macmillandictionary.com/dictionary/british/toxic">toxic</a></em> (when applied to debts) – without having to wait several years. The second big change, which has been gathering pace since the turn of the century, is that the amount of evidence available to us has grown <a href="http://www.macmillandictionary.com/dictionary/british/exponential#exponentially">exponentially</a>, thanks to the Web and social media. Thirdly, we’re no longer limited by space constraints. Even the largest printed dictionaries don’t have the infinite amounts of space that online media provide, so they have to be selective. That’s no bad thing: the removal of these limits shouldn’t be a licence to include just anything. But it does allow us to re-think – and broaden – our inclusion policies.</p>
<p>Above all, older notions about &#8216;what gets into the dictionary&#8217; reflect the idea of the lexicographer as a <a href="http://www.macmillandictionary.com/dictionary/british/gatekeeper">gatekeeper</a>, the belief that it is up to us to decide (on behalf of everyone else) which facts about language deserve the special status of being admitted to a dictionary. This notion of the dictionary having special ‘authority’ (which it confers on the words it includes) is well-established, and still has wide appeal. But it may be incompatible with the priorities and expectations of users of the Web &#8211; especially <a href="http://www.macmillandictionary.com/buzzword/entries/digital-native.html">digital natives</a>. If a word is in common use, people expect to find it in their online dictionary, and they won’t be impressed by the argument that it first requires ‘a decent history of use’. For many users, in other words, speed and convenience, getting a useful answer <em>now</em>, may be more important than authority.</p>
<p>As in so many other areas, one of the impacts of the Web has been a challenge to the old top-down model of one &#8216;expert&#8217; provider and many passive recipients. It isn&#8217;t simply a case of users expecting dictionaries to respond more rapidly to language change – many of them also want to be involved in the compilation process. (Wikipedia is the obvious analogy.) In the final part of this series, we&#8217;ll discuss the implications of &#8216;crowd-sourced&#8217; dictionary content (already a central feature of <a href="http://www.wordnik.com/" target="_blank">Wordnik</a>, for example, and of our own <a href="http://www.macmillandictionary.com/open-dictionary/latestEntries.htm">Open Dictionary</a>), and we&#8217;ll also look at emerging language technologies which might just change everything.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/how-do-words-get-into-the-dictionary-part-2-changing-times/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>How do words get into the dictionary? Part 1: the past</title>
		<link>http://www.macmillandictionaryblog.com/how-words-get-into-the-dictionary-part-1-the-past</link>
		<comments>http://www.macmillandictionaryblog.com/how-words-get-into-the-dictionary-part-1-the-past#comments</comments>
		<pubDate>Thu, 02 Feb 2012 10:00:57 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[dictionaries]]></category>
		<category><![CDATA[new words]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=22316</guid>
		<description><![CDATA[<br/>In Kate Atkinson’s recent novel, Started Early, Took My Dog (2010), there’s an exchange between two of the characters. When one of them mentions a large sum of money, we read that Kelly, the other character, ‘suddenly meerkatted to attention’. Does this mean we have a new verb on our hands, to meerkat? Should it be [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2012/01/MacmillanPhotolibrary_56425_Image100_meerkat.jpg"><img class="alignleft size-medium wp-image-22297" title="© Image100" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2012/01/MacmillanPhotolibrary_56425_Image100_meerkat-199x300.jpg" alt="" width="199" height="300" /></a>In <a href="http://en.wikipedia.org/wiki/Kate_Atkinson" target="_blank">Kate Atkinson</a>’s recent novel, <em>Started Early, Took My Dog</em> (2010), there’s an exchange between two of the characters. When one of them mentions a large sum of money, we read that Kelly, the other character, ‘suddenly <em>meerkatted</em> to attention’. Does this mean we have a new verb on our hands, <em>to meerkat</em>? Should it be added to the dictionary? Probably not. Atkinson is doing what most language users do occasionally (and some do quite often): taking a word or phrase the reader already knows, and doing something inventive with it to create a new meaning. In this case, it’s simply a question of making a verb out of a noun – a process we’ve discussed frequently <a href="http://www.macmillandictionaryblog.com/tag/verbing">in the blog</a> – so the reader has no difficulty understanding the sentence. But this is what we’d call an ‘exploitation’ (a one-off, imaginative coinage) rather than a ‘norm’ (something which has become ‘settled’ in the language through repeated use), and this is one of several factors we have to take account of when deciding what to put in the dictionary.</p>
<p>There’s a <a href="http://www.merriam-webster.com/video/0032-howaword.htm?&amp;t=1326227263" target="_blank">video</a> describing the process by which new words are admitted to Merriam-Webster’s dictionaries. It’s quite a complex procedure, and the editor who explains it says that this question – ‘How does a word get into your dictionary?’ – is the one he gets asked most often. It’s an important question, too, so we’re devoting three posts to it. We’ll look first at the ‘inclusion criteria’ which dictionary editors have traditionally used; then we’ll consider how far these are still relevant in the world of online dictionaries; and in a final post, we’ll look at emerging technologies which could eventually make this whole question irrelevant.</p>
<p>Orin&#8217;s recent <a href="http://www.macmillandictionaryblog.com/trending-now">post</a> on <em>Tebowing </em>highlights the general fascination with new words, and explains why some words never make it into dictionaries. And if the public is engaged with the issue of what goes in the dictionary, journalists and reviewers are even more interested. We’ve discussed <a href="http://www.macmillandictionaryblog.com/why-say-pundigrion-when-you-could-say-pun">before</a> how most dictionary reviews focus almost exclusively on new words and meanings, and this reflects an assumption that ‘getting into the dictionary’ confers a special status on the successful word. As Kerry Maxwell points out:</p>
<blockquote><p>For many, the perception is that any word which has gained enough currency to be officially recorded is a &#8216;proper&#8217; word, here to stay for the use of future generations.</p></blockquote>
<p>Kerry’s <a href="http://www.macmillandictionaries.com/MED-Magazine/May2006/38-New-Word.htm" target="_blank">article</a> on how words get into a Macmillan dictionary provides a useful summary of the key issues. There are three main criteria. First, is it a ‘real’ word anyway, or is it simply, like <em>meerkatted</em>, an individual writer’s playful use of language? (Of course, &#8216;exploitations&#8217; like this can turn into norms – and so become dictionary entries – if other people pick up the usage and recycle it often enough.) Second, what does the evidence tell us about our word&#8217;s use? Any linguistic feature (be it a word, phrase, collocation, or meaning) which occurs frequently enough over a long enough period will start to look as if it is ‘part of the language’ – and therefore to deserve its place in a dictionary.</p>
<p>The problem, of course, is deciding what counts as ‘enough’. And frequency alone is not a sufficient condition, because a word may be used with great frequency in a single text – but hardly at all outside it. An extreme example is a word like <em>droog</em>, which appears repeatedly in the novel <em><a href="http://en.wikipedia.org/wiki/Nadsat" target="_blank">A Clockwork Orange</a></em> (it’s part of the secret language which the main characters use, and means ‘friend’), but has never entered the general language. So ‘dispersion’ – the extent to which a word occurs in a range of different sources – is almost as important as frequency. Third, there is the question of whether the word is appropriate for the particular dictionary we’re dealing with. There’s a big difference between a major historical dictionary (the <em>Oxford English Dictionary</em>, for example, contains over a million headwords, many of them long obsolete) and a dictionary aimed at schoolchildren. So the needs, expectations, and language skills of the dictionary’s intended users are all factors to be taken into account.</p>
<p>Kerry&#8217;s explanation of Macmillan&#8217;s inclusion criteria was written in 2006 – which now seems like a bygone era. At that time, &#8216;the dictionary&#8217; was still (for most people) a printed book of limited dimensions, Facebook and Twitter barely existed, and language technologies in common use today were still in development. In the next post on this topic, we&#8217;ll consider how far traditional ideas about what gets into the dictionary remain relevant in 2012.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/how-words-get-into-the-dictionary-part-1-the-past/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Seen any simpering men lately?</title>
		<link>http://www.macmillandictionaryblog.com/seen-any-simpering-men-lately</link>
		<comments>http://www.macmillandictionaryblog.com/seen-any-simpering-men-lately#comments</comments>
		<pubDate>Wed, 18 Jan 2012 15:39:24 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[gender English]]></category>
		<category><![CDATA[language and words in the news]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[gender]]></category>
		<category><![CDATA[vocabulary]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=22086</guid>
		<description><![CDATA[<br/>The Macmillan Dictionary got a mention in The Guardian yesterday, when Jane Martinson pondered the use of the word simper. A fellow journalist (male) had tweeted about a lawyer (female) ‘simpering’ at a witness (male) in the ongoing Leveson Inquiry. (The inquiry was set up in the wake of revelations that News International journalists had [...]]]></description>
			<content:encoded><![CDATA[<br/><p>The <em>Macmillan Dictionary</em> got a mention in <em>The Guardian</em> yesterday, when <a href="http://www.guardian.co.uk/media/the-womens-blog-with-jane-martinson/2012/jan/17/adam-boulton-twitter-leveson-rusbridger" target="_blank">Jane Martinson</a> pondered the use of the word <em>simper</em>. A fellow journalist (male) had tweeted about a lawyer (female) ‘simpering’ at a witness (male) in the ongoing <a href="http://en.wikipedia.org/wiki/Leveson_Inquiry" target="_blank">Leveson Inquiry</a>. (The inquiry was set up <a href="http://www.macmillandictionary.com/dictionary/british/wake_16#in-the-wake-of-something">in the wake of</a> revelations that News International journalists had obtained stories by <a href="http://www.macmillandictionaryblog.com/sorry-is-the-hardest-word">hacking into the phones</a> of celebrities, politicians, and crime victims.) ‘Can anyone remember,’ she wondered, ‘the last time a man was accused of &#8220;simpering&#8221;?’</p>
<p>She&#8217;s right. Corpus evidence suggests that <em>simper</em> is used three or four times as often about girls and women as about boys or men. Not only that, where the word is used about men, there’s sometimes an implication that they are not ‘real’ men (that’s why they simper): we hear from an American writer about ‘Simpering Frenchman Jacques Chirac’ (apologies to our French readers), and there are several cases of gay men described as <em>simpering</em> too. This happens a lot: the only people who <em><a href="http://www.macmillandictionary.com/dictionary/british/flounce">flounce</a> </em>in and out of rooms are women (overwhelmingly), and gay men (occasionally) – but never heterosexual men. (I&#8217;m just reporting what the data tells us, so don&#8217;t shoot the messenger.)</p>
<p>As always, the co-text is instructive: <em>simper</em> appears with adverbs like <em>flirtatiously, seductively</em>, or <em>sweetly</em>, while other verbs found in the vicinity include <em>fawn, pout, blush</em>, and <em>giggle</em> – all words associated (whether we like it or not) with women. This example from the corpus gives a good flavour of how <em>simper</em> is typically used:</p>
<blockquote><p>She preferred male company … and had no time for giggling, simpering girls who cared for nothing but gossip and the price of hair ribbon.</p></blockquote>
<p>As Jane Martinson pointed out, the example given in the Macmillan <a href="http://www.macmillandictionary.com/dictionary/british/simpering">entry</a> has a female subject (<em>She spoke in a simpering tone</em>), and this takes us back to an issue we discussed last year, <a href="http://www.macmillandictionaryblog.com/whats-a-nice-girl-like-you-doing-in-a-dictionary-like-this">during Gender English month</a>: should dictionary editors ignore the evidence and show a man in the example (as a way of combating gender stereotypes), or do we record what we find? No easy answers here, though we have to balance our gender-neutral instincts with a description of usage that’s true to the data.</p>
<p>Much has been written about words that blatantly insult women: <em>slut, harpy, bitch </em>and the like. But <em>simper</em> belongs to a more interesting category – words which belittle women, but which do it just subtly enough that (some) men think they can get away with it. Something similar is happening with <em>feisty</em>, another &#8216;suspect&#8217; word mentioned by Martinson. Again, the data backs her up: <em>feisty</em> is overwhelmingly used about women, and the nouns it frequently modifies include <em>heroine, redhead, tomboy </em>(=honorary male)<em>, lady, gal</em>, and even<em> <a href="http://www.macmillandictionary.com/dictionary/british/filly">filly</a></em>. On the surface, it conveys admiration &#8211; but this is <a href="http://www.macmillandictionary.com/dictionary/british/qualify#qualify_24">qualified</a> by the implication that &#8216;She did well &#8211; considering she&#8217;s only a woman&#8217;.</p>
<p>There is much more to be said on this subject. A man who is quiet and reserved, for example, tends to be described as <em>taciturn</em> &#8211; a word rarely applied to women &#8211; or even &#8216;the strong silent type&#8217;: both positive descriptions. A woman of the same type is just <em>quiet</em>, and probably also <em>shy</em> or even <em><a href="http://www.macmillandictionary.com/dictionary/british/mousy">mousy</a></em>. Or even <em>simpering</em> … Well, maybe we&#8217;ll come back to this another day. Oh, and thanks to Jane Martinson, too, for adding another word (<em>twarrumph</em>) to our growing <a href="http://www.macmillandictionaries.com/MED-Magazine/October2010/59-WTM.htm" target="_blank">collection</a> of Twitter-inspired vocabulary.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/seen-any-simpering-men-lately/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Which is worse – regime or dictatorship?</title>
		<link>http://www.macmillandictionaryblog.com/which-is-worse-regime-or-dictatorship</link>
		<comments>http://www.macmillandictionaryblog.com/which-is-worse-regime-or-dictatorship#comments</comments>
		<pubDate>Fri, 06 Jan 2012 11:42:45 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[language and words in the news]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[spanish English]]></category>
		<category><![CDATA[collocation]]></category>
		<category><![CDATA[words in the news]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=21784</guid>
		<description><![CDATA[<img src="http://www.macmillandictionaryblog.com/wp-content/uploads/flags/Spain.png" width="48" height="48" alt="" title="spanish English" /><br/>In a recent post, we saw that the word jargon – while more or less synonymous with terminology – has a much more negative feel. As always, you can tell a lot about a word by the company it keeps, and a comparison of the adjectives that frequently collocate with these two nouns is revealing. [...]]]></description>
			<content:encoded><![CDATA[<img src="http://www.macmillandictionaryblog.com/wp-content/uploads/flags/Spain.png" width="48" height="48" alt="" title="spanish English" /><br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2012/01/MacmillanPhotolibrary_16920_BrandX.jpg"><img class="alignleft size-medium wp-image-21805" title="© BrandX" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2012/01/MacmillanPhotolibrary_16920_BrandX-197x300.jpg" alt="" width="197" height="300" /></a>In a recent <a href="http://www.macmillandictionaryblog.com/terminology-or-jargon-youre-empowered-to-decide">post</a>, we saw that the word <em>jargon</em> – while more or less synonymous with <em>terminology</em> – has a much more negative feel. As always, you can tell a lot about a word by the company it keeps, and a comparison of the adjectives that frequently collocate with these two nouns is revealing. Both are frequently used with neutral words like <em>technical, specialized, scientific</em>, and <em>legal</em>. But (unlike <em>terminology</em>) <em>jargon</em> is often modified by adjectives such as <em>incomprehensible</em>, <em>impenetrable</em>, and <em>unintelligible</em>. It is sometimes implied that technical terms are being employed simply in order to make an impression or baffle the listener, when jargon is described as <em>pompous</em>, <em>pretentious</em> and – in over 50 instances in our corpus – <em>unnecessary</em>. So jargon is clearly something to be avoided (in fact <em>avoid </em>is one of its most common verb collocates), as this damning example from the corpus clearly demonstrates:</p>
<blockquote><p><em>This new breed</em> [of manager] <em>can be spotted by their willingness to <a href="http://www.macmillandictionary.com/dictionary/british/spout_7">spout</a> incomprehensible jargon</em>.</p></blockquote>
<p>This technique of examining collocates may help to defuse a row that has erupted this week in Chile, where the new education minister, Harald Beyer, has recommended a <a href="http://www.bbc.co.uk/news/world-latin-america-16420413" target="_blank">controversial change</a> to the history textbooks used in the nation’s schools. From now on, the period when <a href="http://en.wikipedia.org/wiki/Augusto_Pinochet" target="_blank">General Pinochet</a> was in power will no longer be referred to as a ‘dictatorship’ but as a ‘regime’. This has, the news story says, ‘provoked outrage among left-wing opposition parties’ – but are they right to be so <a href="http://www.macmillandictionary.com/dictionary/british/incensed">incensed</a>?</p>
<p>The data suggests that <em>regime</em> and <em>dictatorship</em> are in fact very similar in terms of their ‘emotional charge’. The <a href="http://www.sketchengine.co.uk/" target="_blank">Sketch Engine</a> – the software package we use for analysing corpus data – includes a <a href="http://www.macmillandictionary.com/dictionary/british/nifty">nifty</a> tool that allows you to compare the collocates of two words, and the results for this pair are revealing. Both are modified, with more or less equal frequency, by the adjectives <em>repressive, corrupt, tyrannical,</em> and <em>brutal</em>, while <em>authoritarian </em>and <em>oppressive</em> are even more likely to collocate with <em>regime</em> than with <em>dictatorship</em>. It’s a similar story with the verbs. When these nouns are in the subject position, both occur frequently with neutral words like <em>rule</em> or <em>govern</em>. But, if the linguistic evidence is anything to go by, regimes have an even stronger tendency than dictatorships to <em>persecute, murder, oppress, imprison</em> and <em>torture</em> people. So the proposed change in Chile may not after all do much to improve Pinochet’s image – if anything, ‘regimes’ look worse than ‘dictatorships’.</p>
<p>There’s an obvious flaw in this argument: the data here is about the English equivalents, but Chile is a Spanish-speaking country, and the change at the centre of the argument is not from <em>dictatorship</em> to <em>regime</em>, but from <em>dictadura</em> to <a href="http://diario.latercera.com/2012/01/05/01/contenido/pais/31-96123-9-gobierno-enfrenta-criticas-por-modificaciones-de-textos-escolares.shtml" target="_blank"><em>régimen militar</em></a>. Fortunately, the Sketch Engine has a large corpus of Spanish too, so we can run the same comparison. The results are very different, and (not surprisingly) reflect the political history of Spain and Latin America &#8211; one of the commonest modifiers for both nouns is &#8216;<a href="http://en.wikipedia.org/wiki/Francisco_Franco" target="_blank">Francoist</a>&#8217; (<em>franquista</em>). But here too, there is some evidence to suggest that the connotations of &#8216;regime&#8217; are no less negative than those of &#8216;dictatorship&#8217;, with <em>régimen</em> often attracting adjectives like <em>authoritarian, disciplinarian, fascist</em>, and <em>totalitarian</em>.</p>
<p>As always, comments are welcome, and we&#8217;d be especially interested to hear what our Chilean readers think about this.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/which-is-worse-regime-or-dictatorship/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The future of dictionaries? Too soon to tell</title>
		<link>http://www.macmillandictionaryblog.com/the-future-of-dictionaries-too-soon-to-tell</link>
		<comments>http://www.macmillandictionaryblog.com/the-future-of-dictionaries-too-soon-to-tell#comments</comments>
		<pubDate>Thu, 17 Nov 2011 13:05:45 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[global English]]></category>
		<category><![CDATA[language resources]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[online English]]></category>
		<category><![CDATA[dictionaries]]></category>
		<category><![CDATA[elexicography]]></category>
		<category><![CDATA[lexicography]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=20417</guid>
		<description><![CDATA[<br/>At the recent eLEX 2011 conference in Slovenia (for earlier posts, see here and here), the discussion focussed on the future of dictionaries – or, more broadly, on the various ways in which reference needs might be catered for in years to come. What often happens in this field is that people working in universities [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/11/Bled-image.jpg"><img class="alignleft size-full wp-image-20159" title="Bled, Slovenia" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/11/Bled-image.jpg" alt="" width="307" height="230" /></a>At the recent <a href="http://www.trojina.si/elex2011/" target="_blank">eLEX 2011 conference</a> in Slovenia (for earlier posts, see <a href="http://www.macmillandictionaryblog.com/dispatches-from-the-front-line">here</a> and <a href="http://www.macmillandictionaryblog.com/the-future-of-lexicography-does-lexicography-even-have-a-future">here</a>), the discussion focussed on the future of dictionaries – or, more broadly, on the various ways in which reference needs might be <a href="http://www.macmillandictionary.com/dictionary/british/cater-for">catered for</a> in years to come. What often happens in this field is that people working in universities and research groups develop software tools or learning materials for their own (local) users, but their ideas and methods are then taken up by more mainstream providers. <a href="http://www.kuleuven.be/wieiswie/en/person/u0003500" target="_blank">Serge Verlinde</a> of Leuven University in Belgium gave one of the keynote speeches. Serge is a great example of someone who has done pioneering work over many years to develop online reference tools, in this case for learners of French. His ‘<a href="http://ilt.kuleuven.be/blf/" target="_blank">Base lexicale du français</a>’ (BLF) not only supplies information about word meanings, grammar, and collocation, but also guides the user to make the right word choices when writing in French or translating from French to another language. The BLF also includes a ‘reading assistant’ (providing various kinds of help to enable you to understand a text), and a tool for helping users write a text is in development.</p>
<p>This theme of aids for writing (what are sometimes called ‘text <a href="http://www.macmillandictionary.com/dictionary/british/remediation">remediation</a>’ tools) appeared in several other talks, too. We heard from Magali Paquot about the Louvain English for Academic Purposes Dictionary (<a href="http://www.uclouvain.be/en-322619.html" target="_blank">LEAD</a>), a web-based resource designed to assist in the production of academic writing, and from another group about resources being developed in South Africa to aid ‘text production’ in a number of languages.</p>
<p>One of the features of resources like this is that they should be ‘dynamic’: that is, the system should learn from what users do, and what information they look for, and adapt itself as it goes along. Of course, much of this is still in the planning stage: we know what we want to do, but haven’t yet fully worked out how to do it. Part of the solution lies in having computational tools that can, for example, identify errors or unnatural features in a text supplied by a user. One research group from Barcelona showed how it had achieved a success rate of almost 90% in automatically detecting ‘bad’ collocations, and this kind of tool could form one component of a writing assistant that really worked. In the long term, devices like this could replace conventional dictionaries – at least for language <em>production</em> – because they would do one of the jobs dictionaries have traditionally done, but do it much better.</p>
<p>Another theme at the conference was &#8216;UGC&#8217; (user-generated content), already such a big feature of the online world. News programmes, for example, routinely include information supplied by their viewers and listeners in the form of tweets, emails, or comments posted on their websites. In the world of reference, Wikipedia is the obvious example of a resource created entirely by its users, but the trend is spreading to dictionaries too. <a href="http://www.wordnik.com/" target="_blank">Wordnik</a> has huge amounts of UGC, with numerous words added by members of the public, example sentences &#8216;harvested&#8217; from the <a href="http://www.macmillandictionary.com/open-dictionary/entries/Twittersphere.htm">Twittersphere</a>, and all sorts of lists created by the site&#8217;s users. A new translation tool being developed by the Russian company <a href="http://www.abbyy.com/" target="_blank">ABBYY</a> will include a facility for users to contribute their own translations. And of course Macmillan has its <a href="http://www.macmillandictionary.com/open-dictionary/latestEntries.htm">Open Dictionary</a> – an ever-expanding record of the most up-to-date uses of English around the world.</p>
<p>The conference provided a perfect snapshot of current activity and thinking in this exciting field. We may not be much closer to knowing how things will <a href="http://www.macmillandictionary.com/dictionary/british/pan-out">pan out</a> over the next ten years (or even the next two). Developments in information technology, and in the skills, needs and expectations of its users, are all racing ahead at <a href="http://www.macmillandictionary.com/dictionary/british/breakneck">breakneck speed</a>, so we can’t make predictions with any confidence. It brings to mind the famous remark made by Zhou Enlai, the Prime Minister of China till his death in 1976. When asked what he thought was the long-term significance of the French Revolution of 1789, he replied ‘It’s too soon to tell’.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/the-future-of-dictionaries-too-soon-to-tell/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>The future of lexicography: does lexicography even have a future?</title>
		<link>http://www.macmillandictionaryblog.com/the-future-of-lexicography-does-lexicography-even-have-a-future</link>
		<comments>http://www.macmillandictionaryblog.com/the-future-of-lexicography-does-lexicography-even-have-a-future#comments</comments>
		<pubDate>Fri, 11 Nov 2011 16:15:56 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[global English]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[online English]]></category>
		<category><![CDATA[dictionaries]]></category>
		<category><![CDATA[elexicography]]></category>
		<category><![CDATA[lexicography]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=20202</guid>
		<description><![CDATA[<br/>More news from eLEX2011, the conference on e-lexicography currently taking place in Slovenia. The conference got off to a rip-roaring start as Simon Krek (one of the organizers) outlined a radical vision for a future in which a range of intelligent language tools would be freely available to make communication easier. The functions Simon mentioned [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/11/Bled-image.jpg"><img class="alignleft size-medium wp-image-20159" title="Bled, Slovenia" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/11/Bled-image-300x225.jpg" alt="" width="300" height="225" /></a>More <a href="http://www.macmillandictionaryblog.com/dispatches-from-the-front-line">news</a> from <a href="http://www.trojina.si/elex2011/" target="_blank">eLEX2011</a>, the conference on e-lexicography currently taking place in Slovenia.</p>
<p>The conference got off to a <a href="http://www.macmillandictionary.com/dictionary/british/rip-roaring">rip-roaring</a> start as <a href="http://www.trojina.si/elex2011/Vsebine/keynoteKrek.html" target="_blank">Simon Krek</a> (one of the organizers) outlined a radical vision for a future in which a range of intelligent language tools would be freely available to make communication easier. The functions Simon mentioned include real-time subtitling (so a person in China could follow the U.S. elections on American TV, since everything would be translated on the spot), automatic summarization of complex documents (in your own language or another one), and (most ambitiously) instant speech-to-speech translation.</p>
<p>Some of this may sound like science fiction. But the same could be said of the ideas that visionaries like <a href="http://www.natcorp.ox.ac.uk/jmchs.xml" target="_blank">John Sinclair</a> were putting forward 25 years ago – some of which are now quite mature technologies that we already take for granted.</p>
<p>Dictionaries – in their familiar form, at least – aren’t necessarily part of this longer-term vision. After all, dictionaries evolved in order to fulfil a number of communicative and informational needs that people have, but there may be more efficient ways of meeting those needs. It’s already the case that many people (especially <a href="http://www.macmillandictionary.com/buzzword/entries/digital-native.html">digital natives</a>) no longer turn to a dictionary to find out what something means or how it is pronounced or spelled. In this new model, the user simply expresses a need to resolve a particular language problem, and the computer does the rest. What matters is getting the answer (quickly and reliably) – and the ‘container’ that holds the answer is of no importance. <a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a> (<a href="http://www.macmillandictionaryblog.com/author/adam-kilgarriff">a regular contributor</a> to this blog) put it like this: the dictionary may simply ‘dissolve’ to form one component – albeit an important one – in a much broader operation which can be characterized as ‘search’.</p>
<p>For ‘search’ to work optimally, we need high-quality language resources. Dictionaries are certainly part of this, and so are encyclopedias and corpora. There is interesting work going on in the ‘semantic tagging’ of corpora: the Dutch computational linguist <a href="http://www.vossen.info/" target="_blank">Piek Vossen</a> reported on a project of this type. Essentially, it’s about training computers to do something that human beings are very good at, namely recognising which meaning is intended when several are possible. The challenge is to get the computer to recognise whether the word <em><a href="http://www.macmillandictionary.com/dictionary/british/mouse">mouse</a></em>, in a particular context, refers to a small rodent, a computer device, or a shy, quiet person. This is extremely difficult work, but Piek’s paper demonstrated that the task was doable, and that the computer’s success rate was gradually improving.</p>
<p>I can&#8217;t do justice to the range of excellent papers I&#8217;ve attended so far; I feel I&#8217;ll need a couple of weeks in a quiet place just to digest it all. Many talks dealt with issues we&#8217;ve looked at regularly on this blog. The importance of <a href="http://www.macmillandictionaryblog.com/tag/pragmatics">pragmatics</a>, for example, was highlighted by <a href="http://www.mendeley.com/profiles/mojca-sorli/" target="_blank">Mojca Šorli</a>, who had some good proposals for improving the way dictionaries present this information. <a href="http://www.macmillandictionaries.com/features/how-dictionaries-are-written/macmillan-collocations-dictionary/#1" target="_blank">Collocation</a>, not surprisingly, featured in several sessions: how to extract collocations automatically from corpora; how to integrate them into reference resources; how to cater for users who need more specialized collocational information (for example when writing academic texts); and how computer systems can be trained to identify &#8216;miscollocations&#8217; and propose corrections. And as always, there&#8217;s a lot to learn from resources being developed for languages other than English.</p>
<p><a href="http://www.erinmckean.com/" target="_blank">Erin McKean</a> rounded off the first day with an entertaining talk about the amazing <a href="http://www.wordnik.com" target="_blank">Wordnik</a> site &#8211; one example of a resource that goes way beyond what we&#8217;d expect in a conventional dictionary, and shows (among other things) how engaging user-generated content can be. As good a demonstration as any of a well-known quote by science fiction writer William Gibson: &#8220;The future is already here &#8211; it&#8217;s just not very evenly distributed&#8221;.</p>
<p>There&#8217;s no space (or time) to say more, but there will be a final round-up early next week.</p>
<p>(For a final summary of eLEX2011, see <a href="http://www.macmillandictionaryblog.com/the-future-of-dictionaries-too-soon-to-tell">this post</a>.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/the-future-of-lexicography-does-lexicography-even-have-a-future/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Dispatches from the front line</title>
		<link>http://www.macmillandictionaryblog.com/dispatches-from-the-front-line</link>
		<comments>http://www.macmillandictionaryblog.com/dispatches-from-the-front-line#comments</comments>
		<pubDate>Wed, 09 Nov 2011 15:57:33 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[global English]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[online English]]></category>
		<category><![CDATA[elexicography]]></category>
		<category><![CDATA[lexicography]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=20147</guid>
		<description><![CDATA[<br/>Today’s post comes from the beautiful Slovenian city of Bled, where I’m attending a conference called ‘eLEX2011’– or ‘Electronic lexicography in the 21st century’. Regular readers will be aware of how completely the job of producing dictionaries was transformed in the 1980s by the arrival of large language corpora. Those were pioneering times, and the [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/11/Bled-image.jpg"><img class="alignleft size-medium wp-image-20159" title="Bled, Slovenia" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/11/Bled-image-300x225.jpg" alt="" width="300" height="225" /></a>Today’s post comes from the beautiful Slovenian city of <a href="http://www.bled.si/en/" target="_blank">Bled</a>, where I’m attending a conference called <a href="http://www.trojina.si/elex2011/" target="_blank">‘eLEX2011</a>’ – or ‘Electronic lexicography in the 21st century’.</p>
<p>Regular readers will be aware of how completely the job of producing dictionaries was transformed in the 1980s by the arrival of large language corpora. Those were pioneering times, and the technology struggled to keep up with our appetite for more linguistic data. Now we are awash with it. It has enabled us to produce much better, much more complete descriptions of language. We’d be wary now of making any definitive statement about a word or phrase without first reviewing its use in real communicative situations. A recent post on the use of <a href="http://www.macmillandictionaryblog.com/i-hope-this-isnt-a-complete-waste-of-time"><em>complete</em></a>, for example, would have been impossible without a decent corpus and the software to interrogate it. And yet, even 20 years after corpora first came on the scene, the dictionary itself didn’t look radically different. This was a revolution at the ‘producer’ end: dictionary-<em>making</em> would never be the same, but for dictionary <em>users</em> the changes weren’t so obvious.</p>
<p>Now there’s a second revolution – and this time it’s at the ‘consumer’ end. Dictionaries first went digital in the mid-90s, as publishers made their wares available on CD-ROMs as well as in printed books. But the changes this brought were incremental rather than revolutionary, and that technology reached its <a href="http://www.macmillandictionary.com/dictionary/british/sell-by-date">sell-by date</a> before it had the chance to cause too much disruption to the way dictionaries looked. The really big changes started in the <a href="http://www.macmillandictionary.com/dictionary/british/noughties">noughties</a> with ‘<a href="http://en.wikipedia.org/wiki/Web_2.0" target="_blank">Web 2.0</a>’, and the pace has been accelerating as <a href="http://www.macmillandictionary.com/buzzword/entries/digital-native.html">digital natives</a> come of age.</p>
<p>The theme of the conference is e-lexicography, but as someone said: what other kind of lexicography is there? This is the second event of its kind. The first, organized by <a href="http://www.uclouvain.be/en-cecl.html" target="_blank">Sylviane Granger and Magali Paquot</a> (who have contributed to our <a href="http://www.macmillandictionaries.com/about/med/the-louvain-connection/">dictionary resources</a> at Macmillan), took place in Belgium in 2009. It was an eye-opener. But the scary thing is that, in the short space of two years, the landscape has already changed significantly. <a href="http://www.macmillandictionary.com/dictionary/british/connectivity">Connectivity</a> rates have advanced so rapidly that, in most parts of the world, high-speed web access is now the norm, or very soon will be. Against this background, the demise of the printed book as a medium for general reference materials – still a topic of debate at &#8216;eLEX2009&#8217; – now looks inevitable.</p>
<p>Another striking change is the rise and rise of mobile computing. I still have an old-school &#8216;tower&#8217; computer at home, but this is a dying format. It&#8217;s hard to believe that the iPad was only launched in January 2010 (six months <em>after</em> the previous eLEX conference) but this and similar mobile devices are becoming increasingly dominant. Sales of tablet computers are predicted to hit 300 million units per year by 2015, with smartphone sales even higher. Dictionary publishers are already engaging with these new media – the <em>Macmillan Dictionary</em>, like many others, has <a href="http://www.macmillaneducationapps.com/">apps</a> for iPhones and iPads. But the longer-term implications of this <a href="http://www.macmillandictionary.com/dictionary/british/game-changing">game-changing</a> technology are far from clear. We’re right in the middle of a second major revolution for lexicography, and no-one really knows where it is headed. That&#8217;s what we&#8217;re hoping to find out at this conference, and the Macmillan blog will keep you up to date with developments.</p>
<p>(For more news from eLEX 2011, see <a href="http://www.macmillandictionaryblog.com/the-future-of-lexicography-does-lexicography-even-have-a-future">here</a> and <a href="http://www.macmillandictionaryblog.com/the-future-of-dictionaries-too-soon-to-tell">here</a>.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/dispatches-from-the-front-line/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>5 million books</title>
		<link>http://www.macmillandictionaryblog.com/5-million-books</link>
		<comments>http://www.macmillandictionaryblog.com/5-million-books#comments</comments>
		<pubDate>Fri, 23 Sep 2011 12:20:32 +0000</pubDate>
		<dc:creator>Caroline Short</dc:creator>
				<category><![CDATA[language technology]]></category>
		<category><![CDATA[Learn English]]></category>
		<category><![CDATA[culturomics]]></category>
		<category><![CDATA[microblog]]></category>
		<category><![CDATA[microblogs]]></category>
		<category><![CDATA[Ngram]]></category>
		<category><![CDATA[TED Talks]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=18365</guid>
		<description><![CDATA[<br/>This week&#8217;s &#8216;language in new media&#8217; post is one of the fantastic TED Talks. What we learned from 5 million books uses Google Labs&#8217; Ngram Viewer tool to tell us why exactly a picture is worth so much more than a thousand words! If you&#8217;re new to the Ngram Viewer, you might also like to [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/07/ipad2_white_hand1.jpg"><img class="alignleft size-full wp-image-17012" title="Photo courtesy of Apple Inc." src="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/07/ipad2_white_hand1.jpg" alt="Photo courtesy of Apple Inc." width="200" height="79" /></a>This week&#8217;s &#8216;language in new media&#8217; post is one of the fantastic TED Talks.</p>
<p><a href="http://www.ted.com/talks/what_we_learned_from_5_million_books.html" target="_blank">What we learned from 5 million books</a> uses Google Labs&#8217; Ngram Viewer tool to tell us why exactly a picture is worth so much more than a thousand words!</p>
<p>If you&#8217;re new to the Ngram Viewer, you might also like to read <a href="http://www.macmillandictionaryblog.com/culturomics-and-n-grams">Stan Carey&#8217;s post</a> on the subject from earlier this year.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/5-million-books/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Turning words into music</title>
		<link>http://www.macmillandictionaryblog.com/turning-words-into-music</link>
		<comments>http://www.macmillandictionaryblog.com/turning-words-into-music#comments</comments>
		<pubDate>Thu, 11 Aug 2011 08:00:14 +0000</pubDate>
		<dc:creator>Caroline Short</dc:creator>
				<category><![CDATA[language and words in the news]]></category>
		<category><![CDATA[language technology]]></category>
		<category><![CDATA[language news]]></category>
		<category><![CDATA[microblog]]></category>
		<category><![CDATA[microblogs]]></category>
		<category><![CDATA[technology]]></category>
		<category><![CDATA[twinthesis]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=17148</guid>
		<description><![CDATA[<br/>This week&#8217;s &#8216;language in new media&#8217; post explores the &#8216;melody of microblogging&#8217; in &#8216;The real sound of Twitter&#8217;. 21-year-old student Sam Harman, or &#8220;evil doctor tweet&#8221; as he is sometimes known, has created a program which turns the global language of Twitter into music. Twinthesis, or &#8216;Twitter powered synthesis&#8217;, harnesses the daily tirade of tweets [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/07/ipad2_white_hand1.jpg"><img class="alignleft size-full wp-image-17012" title="Photo courtesy of Apple Inc." src="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/07/ipad2_white_hand1.jpg" alt="Photo courtesy of Apple Inc." width="200" height="79" /></a>This week&#8217;s &#8216;language in new media&#8217; post explores the &#8216;melody of microblogging&#8217; in &#8216;The real sound of Twitter&#8217;.</p>
<p>21-year-old student Sam Harman, or &#8220;evil doctor tweet&#8221; as he is sometimes known, has created a program which turns the global language of Twitter into music. <a href="http://samharman.com/2011/03/twinthesis-twitter-powered-synthesis/" target="_blank"><em>Twinthesis</em></a>, or &#8216;Twitter powered synthesis&#8217;, harnesses the daily tirade of tweets and transforms them into musical notes, reflecting the tone of the original tweet.</p>
<p><a href="http://news.bbc.co.uk/today/hi/today/newsid_9555000/9555716.stm" target="_blank">Listen to Sam</a> explaining his project to Jon Kay on BBC Radio 4.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/turning-words-into-music/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Man vs machine: dictionaries and LT</title>
		<link>http://www.macmillandictionaryblog.com/man-vs-machine</link>
		<comments>http://www.macmillandictionaryblog.com/man-vs-machine#comments</comments>
		<pubDate>Thu, 17 Feb 2011 10:35:58 +0000</pubDate>
		<dc:creator>Michael Rundell</dc:creator>
				<category><![CDATA[language technology]]></category>
		<category><![CDATA[linguistics and lexicography]]></category>
		<category><![CDATA[dictionaries]]></category>
		<category><![CDATA[webinar]]></category>

		<guid isPermaLink="false">http://www.macmillandictionaryblog.com/?p=12429</guid>
		<description><![CDATA[<br/>Macmillan runs a series of webinars, which are a bit like interactive lectures that anyone can join in. Coming up in 2011 are speakers such as Lindsay Clandfield and Simon Greenall, and from the same page you can watch sessions from the archive featuring well-known language-teaching experts like Scott Thornbury, Ken Wilson and Sam McCarter. [...]]]></description>
			<content:encoded><![CDATA[<br/><p><a href="http://www.macmillanenglish.com/BlankTemplate.aspx?id=43108"><img class="alignleft size-full wp-image-12486" title="Macmillan Interactive Webinars 2011" src="http://www.macmillandictionaryblog.com/wp-content/uploads/2011/02/webinar.jpg" alt="" width="268" height="142" /></a>Macmillan runs a series of <a href="http://www.macmillandictionary.com/dictionary/british/webinar">webinars</a>, which are a bit like interactive lectures that anyone can join in. Coming up in <a href="http://www.macmillanenglish.com/BlankTemplate.aspx?id=43108">2011</a> are speakers such as Lindsay Clandfield and Simon Greenall, and from the same page you can watch sessions from the archive featuring well-known language-teaching experts like Scott Thornbury, Ken Wilson and Sam McCarter. A couple of weeks ago I did a webinar on the subject of language technology and its impact on dictionaries, which you can find <a href="http://www.macmillanenglish.com/BlankTemplate.aspx?id=53124">here</a> (you can also download the PowerPoint presentation that the webinar was based around). A word of warning: the first five minutes or so are a little messy (we had a few <a href="http://www.macmillandictionary.com/dictionary/british/teething-problems">teething problems</a> with the technology), but it gets better after that.</p>
<p>While computer technology means we can put our dictionary online, hold webinars, and discuss language issues in this blog, <em>language </em>technology is more about the data and software we use in the background, to help us decide what to say about words. Language technology (LT) is big business. It&#8217;s what powers search engines like Google or sites offering automatic translation. Some LT tools are pretty simple: sites like <a href="http://www.macmillandictionaryblog.com/pick-a-fight">Google Fight</a> or Google&#8217;s <a href="http://www.macmillandictionaryblog.com/culturomics-and-n-grams">Ngram Viewer</a> work by counting the number of times a particular word or phrase is used – and counting is something that computers are good at. That&#8217;s also how we can identify, very reliably, the &#8216;core&#8217; vocabulary of English, which is shown as the <a href="http://www.macmillandictionary.com/learn/red-words.html">red words</a> in the <a href="http://www.macmillandictionary.com/">Macmillan Dictionary</a>. But for more sophisticated tasks like translating, computers have to be trained to understand human language – or at least to perform <em>as if </em>they understand it.</p>
<p>This is hard for computers. Language is full of ambiguities because most words have more than one meaning. Human beings are good at dealing with this, and in most situations misunderstandings are rare. If I say &#8216;I&#8217;m going to the bank&#8217;, the person I&#8217;m talking to doesn&#8217;t need to ask if I mean &#8216;the financial institution&#8217; or &#8216;the side of the river&#8217;. Context will tell them which sense of <em>bank</em> I&#8217;m referring to. But the only thing a machine knows is that <em>bank</em> has two possible meanings (which have different equivalents in other languages), and it has to decide which one fits best. So the main goal of language technology is to enable computers to do what humans do so effortlessly when they communicate with one another. Progress towards this goal is slow but steady. Just this week, an <a href="http://www.huffingtonpost.com/2011/02/17/ibm-watson-jeopardy-wins_n_824382.html" target="_blank">IBM computer</a> has beaten two champion (human) contestants in the popular US quiz show <em>Jeopardy!</em></p>
<p>Thanks to research in this field, lexicographers now have powerful software that reveals far more about how words behave and work together than we knew just ten or even five years ago – and that&#8217;s the main theme of my webinar. I&#8217;m afraid I probably talked for too long, and when the session ended there was no time to answer anyone&#8217;s questions. So this is another opportunity: if there&#8217;s anything you&#8217;d like to ask about this subject, use the Comments here and I&#8217;ll get back to you.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.macmillandictionaryblog.com/man-vs-machine/feed</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

