Advanced Learner English Corpus (ALEC)

ALEC comprises 146 texts, totalling approximately 1.3 million words, written by students of English linguistics or English literature in their third to fifth year of university studies. The vast majority of the students have Swedish as their first language (L1).

Project leader: Tove Larsson, Uppsala University
Time of compilation: 2013
Size: approximately 1.3 million words
Language: English
Number of texts/samples: 146
Period: 2004–2013
Released: 2013
Funding: The corpus was compiled as part of the compiler's PhD project at Uppsala University
Project home page: http://katalog.uu.se/empinfo/?languageId=1&id=N12-2236

Manual

There is currently no official corpus manual. However, each file contains metadata about the student, such as language background and year of study.

Compilers

Tove Larsson

Availability

Unfortunately, the corpus is not freely available at present.

Technical information

The corpus is tokenized and encoded in Unicode UTF-8. It is distributed in XML, HTML and plain-text (TXT) formats.
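
As an illustration of working with a tokenized UTF-8 text file, the following is a minimal Python sketch that counts the tokens in one plain-text file. It assumes that tokens in the TXT version are separated by whitespace, which this description does not confirm. The function names and the example file name are hypothetical and are not part of the corpus distribution.

```python
from pathlib import Path


def count_tokens(text: str) -> int:
    # Assumption: tokens are whitespace-separated in the TXT version.
    return len(text.split())


def count_tokens_in_file(path: str) -> int:
    # Decode explicitly as UTF-8 instead of relying on the platform default.
    return count_tokens(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    # "example_text.txt" is a hypothetical file name used for illustration only.
    sample = Path("example_text.txt")
    if sample.exists():
        print(count_tokens_in_file(str(sample)))
```

Passing the encoding explicitly matters on systems whose default encoding is not UTF-8, where non-ASCII characters could otherwise be misread.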