At the recent eLEX 2011 conference in Slovenia (for earlier posts, see here and here), the discussion focussed on the future of dictionaries – or, more broadly, on the various ways in which reference needs might be catered for in years to come. What often happens in this field is that people working in universities and research groups develop software tools or learning materials for their own (local) users, and their ideas and methods are then taken up by more mainstream providers. Serge Verlinde of Leuven University in Belgium gave one of the keynote speeches. Serge is a great example of someone who has done pioneering work over many years to develop online reference tools, in this case for learners of French. His ‘Base lexicale du français’ (BLF) not only supplies information about word meanings, grammar, and collocation, but also guides the user towards the right word choices when writing in French or translating from French into another language. The BLF also includes a ‘reading assistant’ (providing various kinds of help with understanding a text), and a tool to help users write a text is in development.
This theme of aids for writing (what are sometimes called ‘text remediation’ tools) appeared in several other talks, too. We heard from Magali Paquot about the Louvain English for Academic Purposes Dictionary (LEAD), a web-based resource designed to assist in the production of academic writing, and from another group about resources being developed in South Africa to aid ‘text production’ in a number of languages.
One of the desired features of resources like these is that they should be ‘dynamic’: that is, the system should learn from what users do and what information they look for, and adapt itself as it goes along. Of course, much of this is still at the planning stage: we know what we want to do, but haven’t yet fully worked out how to do it. Part of the solution lies in having computational tools that can, for example, identify errors or unnatural features in a text supplied by a user. One research group from Barcelona showed how it had achieved a success rate of almost 90% in automatically detecting ‘bad’ collocations, and this kind of tool could form one component of a writing assistant that really worked. In the long term, devices like this could replace conventional dictionaries – at least for language production – because they would do one of the jobs dictionaries have traditionally done, but do it much better.
Another theme at the conference was ‘UGC’ (user-generated content), already such a big feature of the online world. News programmes, for example, routinely include information supplied by their viewers and listeners in the form of tweets, emails, or comments posted on their websites. In the world of reference, Wikipedia is the obvious example of a resource created entirely by its users, but the trend is spreading to dictionaries too. Wordnik has huge amounts of UGC, with numerous words added by members of the public, example sentences ‘harvested’ from the Twittersphere, and all sorts of lists created by the site’s users. A new translation tool being developed by the Russian company ABBYY will include a facility for users to contribute their own translations. And of course Macmillan has its Open Dictionary – an ever-expanding record of the most up-to-date uses of English around the world.
The conference provided a perfect snapshot of current activity and thinking in this exciting field. We may not be much closer to knowing how things will pan out over the next ten years (or even the next two). Developments in information technology, and in the skills, needs and expectations of its users, are all racing ahead at breakneck speed, so we can’t make predictions with any confidence. It brings to mind the famous remark made by Zhou Enlai, the Prime Minister of China until his death in 1976. When asked what he thought was the long-term significance of the French Revolution of 1789, he replied ‘It’s too soon to tell’.