
Corpus linguistics in a MOOC – the future of education?

Today’s guest post comes from Tony McEnery, Professor of Linguistics and English Language at Lancaster University, and a leading figure in the world of corpus linguistics.


If somebody had told me, when I agreed to do a massive open online course (MOOC) for corpus linguistics, that I would be crowd-sourcing word meanings from thousands of enthusiastic students in places as far-flung as the British Antarctic Territory and the Pitcairn Islands, I doubt I would have believed it. But that is precisely what happened recently on the course I am running on the FutureLearn platform. Two weeks ago thousands of students checked their intuitions about the meaning of words against corpus evidence and were just as surprised by the results as the lexicographers in pioneering corpus-based dictionary building projects were thirty years ago. The students’ intuitions were typically pretty similar, and the points that the data gave them were, fairly uniformly, surprising to them. The interest – and surprise – for the students was immediate and a joy to behold. In the following week we crowd-sourced analyses of stories about refugees and asylum seekers from around the world. When the students compared their results with those of a corpus-based study I had done, which showed the British press representing these groups negatively, an interesting result came in. Many of the negative representations you can see in the British press are present in whole or part in the press around the world. That includes non-English-language news reporting too.

In teaching corpus linguistics to classes over the years I have always stood by a simple truth – when you explore data with students, while you have the capacity to surprise them with what you have found in a corpus, students are also empowered to do just the same to you. With a MOOC the same is true, but the scale shifts. I can surprise and inform thousands of students with insights into the role corpora have to play in lexicography and how partial intuited meanings can be. At the same time, they can turn around and, almost with one voice, tell me that what I thought was true of the language of British newspapers is true in many contexts and many languages worldwide. The scale of such an endeavour is breathtaking. We have been running training classes, free of charge, in corpus linguistics at Lancaster University each year for some time now. In the first week of the MOOC alone we taught as many students as we would have if we ran our summer school for 100 years. In terms of getting the idea of the corpus approach to language ‘out there’ our MOOC has been a massive success – and the students like it, which, I must concede, is a relief! So much effort and thought went into it that it is really gratifying to see so many people get so much out of it.

Are MOOCs the future of education? Well, in my opinion, yes and no. Yes – we must use them. For some students this is their best shot at getting some instruction from people who are too far away and too expensive to access for face-to-face education. For others, it is a way of dipping their toe in the water – they can find out the rudiments of a subject to see if they want to invest more time in studying it. But then also no – MOOCs must live with, and complement, face-to-face teaching, in my view. The responsiveness and immediacy of face-to-face teaching cannot be readily provided via a MOOC. If nothing else, the scale of the enterprise defies any credible and sustained attempt at building a rapport with individual students, which is, in my experience, a key motivator for students and staff alike. As we move forward we must blend MOOC and face-to-face education, to the benefit of all. Anyway – I must stop writing now. I have to prepare for the next crowd-sourced bit of research the students will do: looking at the ways in which people in different parts of the world talk about disabled people. I am bound to find out something new!


About the author


Tony McEnery


  • As a student participant on this corpus MOOC, I would like to say what a thoroughly rewarding experience it has been over the last seven weeks, with one more week remaining. I entered my first week of the course with a very basic working knowledge of what a corpus is, how to analyse one, and what corpus linguistics can bring not only to the field of linguistics, but to many other areas of investigation. Barely a month and a half later, following the excellent tutorials, guides and peer/mentor/facilitator support within the MOOC, I am now using software such as AntConc and CQPweb to analyse corpora of millions of words, in a variety of languages, to look for things such as keywords, collocations, and semantic preferences.

    The MOOC is excellently organised, very well supported, and has a wealth of videos, tutorials, guides, tasks, transcripts, discussion threads and keen participants.

    The one issue I am struggling with is trying to balance my life outside the MOOC with the one inside it! There is so much content in there of interest and relevance to my studies that I get completely absorbed in watching a video seminar or carrying out my own analysis following one of the tutorials. But surely this is one of the better things to have an addiction to.

  • Hi Tony, this is great – I’m impressed that you have made such strides in using corpora with students – and with such large numbers of people involved. The notion of using surprise – challenging students’ intuitions with corpus evidence – as a jumping-off point is a stroke of genius. As Keith points out above, once you start, you’re hooked.

  • Thanks Keith and Gill – it has been a humbling but also thrilling experience to get so *many* people interested in corpus linguistics. It has also been huge fun! We are running the course again in late September and hope to run it every September from then on.

  • Thanks for this Tony, as a lexicographer and having recently written an article explaining the term MOOC for BuzzWord, it’s really fascinating to hear about how phenomenally popular, and rewarding, they can be in practice …

  • I would like to thank Tony McEnery, all moderators and contributors for this wonderful course! I started this course with almost zero knowledge about corpus linguistics and its methods. I am not saying that now I can do profound and deep research but I am definitely in love with this course-) It is well structured, clearly presented, and gives a great variety of info I would never be able to get here in Ukraine. I am sure I will connect my future with corpus linguistics as it is what I really want to investigate. I am especially excited about how it can be applied to social sciences and discourse analysis. Thank you so much for this wonderful course. Though I feel a little bit sad that it is coming to its end this week.-)

  • Well Victoria, we are also sad it is coming to an end – which is why we will be back, running the course again, on 29th September!
