International Corpus of English – the Scottish component (ICE-SCO)

The project aims to compile a 1-million-word corpus of spoken and written 21st-century Scottish English. The corpus will contain the text categories and annotations specified by the ICE project, plus a large number of additional linguistic annotations. The corpus creation process is agile: it is query-driven, based on a cyclic processing model, and follows the minimal effort principle (see Voormann & Gut 2008).

Project leaders: Ulrike Gut and Robert Fuchs, University of Münster
Time of compilation: 2014–
Size: 1,000,000+ words
Language: English
Period: 2013–2016
Released: in preparation
Project home page: http://www.uni-muenster.de/Anglistik/Research/EngLing/research/ice-sco.html

Reference line and copyright

Schützler, Ole, Ulrike Gut and Robert Fuchs (to appear). New perspectives on Scottish Standard English: Introducing the Scottish component of the International Corpus of English. In Beal, Joan and Sylvie Hancil (eds.). Northern British English (working title). Berlin: de Gruyter.

Compilers

Ulrike Gut
Elvira Hadzic
Ole Schützler
Jennifer Smith
Laura Sollgan
Silke Stagg
Holger Voormann
Sarah-Loana Weiß
Daniel Zerner
Robert Fuchs

File format

The corpus will be available in an XML format.

Availability

The corpus is currently being compiled. At the time of writing (Nov. 2015), it comprises 383,000 words. Researchers who would like to use the currently available data (including the audio files for the spoken part) are encouraged to get in touch with the project leaders.