Background and History

(Source: F-LOB manual, original version)

In 1991 a group of students at Freiburg University were engaged in what at first sight must appear an almost anachronistic activity: they were keying in extracts of roughly 2,000 words from British newspapers. The sampling model was the press section of the LOB corpus. Work on a new counterpart to the Brown corpus began in 1992. The ultimate aim was to compile parallel one-million-word corpora of the early 1990s that matched the original LOB and Brown corpora as closely as possible, and that would thus provide linguists with an empirical basis for studying language change in progress. This aim is spelled out in some detail in Mair (1997: 196). The parallel corpora were compiled to enable linguists to:

  1. test at least some current hypotheses on linguistic change in present-day English;
  2. detect changes not previously noticed in the literature through the systematic comparison of lexical frequencies, particularly of closed-class items;
  3. tackle systematically one of the major methodological issues in the study of ongoing change, namely the interdependence of synchronic regional (in our case British vs. American) and stylistic variation on the one hand, and genuine diachronic developments on the other.

An additional advantage of the new British and American corpora is that they provide more suitable databases than the original LOB and Brown corpora for comparison with the Indian, Australian and New Zealand corpora, whose samples represent language use of the late 1980s.

References

Mair, Christian. 1997. "Parallel corpora: A real-time approach to language change in progress." In Ljung, Magnus, ed. Corpus-Based Studies in English: Papers from the Seventeenth International Conference on English-Language Research Based on Computerized Corpora (ICAME 17). Amsterdam: Rodopi. pp. 195-209.