
Tapping the brain for words

Our next guest post comes from Doug Higby. Doug is with SIL International, where he coordinates training and promotion of technology for advancing language-based development in the thousands of languages where SIL works.

__________



If you were to build a dictionary from scratch, how would you go about it? Would you start with ‘a’ for aardvark or would you find some other way to think of words, alphabetizing them later? Happily, most of us English speakers have never had this dilemma. We just reach for the nearest dictionary or web browser and look up the word in question. However, many languages exist today that still don’t have a significant dictionary.

As a linguist, I worked in Mali for many years with the Fulfulde language of the Fulani people. I had collected at least a dozen different bilingual word lists that others had made, but none of them was very comprehensive. At the time, the process of collecting words involved writing them down as they were encountered during linguistic fieldwork – a slow and tedious process that resulted in dictionaries of only a few thousand words after years of work.

That was nine years ago. If I were to start all over today, I wouldn’t bother with existing word lists but would instead harness the power of the community and the human brain. It would be a collaborative effort to build the most comprehensive Fulfulde dictionary ever with a proven method called Rapid Word Collection.

The fact is, our brains don’t store words alphabetically. If they did, you would have to mentally flip through a card catalog to find the right word to use in conversation, and your speech would be punctuated with awkward pauses whenever a lookup failed because of a spelling mistake! Fortunately, our brains store words in semantic domains – areas of meaning. When one of those areas is primed, related words come to mind easily and quickly. Words are linked to other words and other domains, forming a whole network of meaning. You can visually approximate this on the web by visiting snappywords.com, which does a great job of displaying the interrelatedness of words.

To help cull words from deep in our mental network, Ron Moe of SIL International has developed a system of 1800 semantic domains, arranged in hierarchical fashion. For each of the domains, there is a series of questions designed to elicit words. For example, in the domain entitled “Cooking methods,” one of the questions is:

What words refer to various ways of cooking food?

In English we have words like cook, bake, baste, boil, braise, brew, and broil, just to name a few. If you sat down with a group of English speakers, you could probably come up with a dozen more. This is essentially what happens in Rapid Word Collection. Moe designed the process, which SIL later refined to capture better data. It is done in just two weeks, with about 30 native speakers of the language working in six separate teams. All the words are typed into a computer, complete with translations and semantic domain references. The result is a fairly comprehensive lexical database and thesaurus that will serve as the foundation for any further dictionary work.
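For readers who think in code, here is a minimal sketch of how such a database could be organized. It is not the software used in Rapid Word Collection workshops; the class names, field names, and the `cooking-methods` key are all made up for illustration, and the English words and glosses simply stand in for vernacular words and their translations. The point is that words are stored by domain first, which gives you a thesaurus for free, and alphabetizing is a single step at the end.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Domain:
    """One semantic domain (hypothetical structure, not SIL's format)."""
    key: str                       # made-up identifier, not an official code
    title: str
    questions: list = field(default_factory=list)
    parent: Optional[str] = None   # key of the parent domain, if any


@dataclass(frozen=True)
class Sense:
    """One collected word sense: a word, its gloss, and its domain."""
    lexeme: str
    gloss: str
    domain: str                    # key of the Domain it was elicited under


def build_thesaurus(senses):
    """Group lexemes by domain, keeping the order in which they were collected."""
    thesaurus = defaultdict(list)
    for sense in senses:
        if sense.lexeme not in thesaurus[sense.domain]:
            thesaurus[sense.domain].append(sense.lexeme)
    return dict(thesaurus)


def alphabetize(senses):
    """Produce the dictionary's headword list, sorted only at the very end."""
    return sorted({sense.lexeme for sense in senses}, key=str.casefold)


cooking = Domain(
    key="cooking-methods",
    title="Cooking methods",
    questions=["What words refer to various ways of cooking food?"],
)

# Words in the order a team might call them out, not in alphabetical order.
senses = [
    Sense("cook", "prepare food with heat", cooking.key),
    Sense("bake", "cook in an oven", cooking.key),
    Sense("boil", "cook in boiling water", cooking.key),
]

print(build_thesaurus(senses))
print(alphabetize(senses))
```

The thesaurus view keeps the words in the order the team thought of them, while the alphabetical list is derived from the same records afterwards.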

In January of 2012, armed with three new laptop computers, I headed to Sandema, Ghana with Ron Moe and Art Cooper to try this new approach on a different language: Buli. We were met by the Buli literacy team and a group of 30+ local volunteers, excited to be part of the effort to preserve their language and increase the educational opportunities for their children. Our goal was to see how many Buli lexemes with English word glosses could be collected and entered by systematically going through Ron’s semantic domain questions. In two weeks, we amassed a database comprising 14,747 word senses, associated with approximately 10,000 distinct lexemes! All this is the result of tweaking the process of collecting words to align with the brain’s internal process of storing them.
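To put those figures in perspective, here is a quick back-of-the-envelope calculation. It uses only the two numbers above and assumes that "two weeks" means 14 calendar days; the actual number of working days is not stated here, so the daily figure is a rough lower bound on the pace.

```python
senses = 14_747     # word senses collected in the Buli workshop
lexemes = 10_000    # "approximately 10,000" distinct lexemes
days = 14           # assumption: two weeks counted as 14 calendar days

senses_per_lexeme = senses / lexemes
senses_per_day = senses / days

print(f"{senses_per_lexeme:.2f} senses per lexeme")
print(f"{senses_per_day:.0f} senses per day")
```

That works out to roughly one and a half senses per lexeme and about a thousand senses a day.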

The complete Rapid Word Collection process is documented and repeatable by following the instructions on the rapidwords.net website. While you’re there, don’t miss the video on the Buli word-collection workshop!


About the author

Doug Higby

1 Comment

  • Thanks for this amazing post, Doug, I urge everyone to have a look at the video on rapidwords.net – it’s inspiring (and a great advertisement for a low-tech but highly effective form of crowd-sourcing).
