Old English

Leena Kahlas-Tarkka, Matti Kilpiö and Aune Österman

(Adapted, with the publisher's permission, from Rissanen, Matti - Merja Kytö - Minna Palander (1993), Early English in the computer age: explorations through the Helsinki Corpus. Berlin - New York: Mouton de Gruyter. Pp. 21-32. Minor changes have been made to remove outdated statements. Note a has been added for update information.)

1. Introduction

The texts representing the Old English period are divided into four subsections. Each of these covers a century, with the exception of the earliest (OE1), which consists of writings prior to the year 850. Both prose and verse are included, although verse texts are in a clear minority. The sizes of the subsections, as well as the amounts of prose and verse, are indicated in Table 1.

Table 1. The Old English period in the Helsinki Corpus: A quantitative overview

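| Subperiod | Words | % | Prose | Verse |
|---|---|---|---|---|
| OE1 (–850) | 2,190 | (0.5) | 1,960 | 230 |
| OE2 (850–950) | 92,050 | (22.3) | 91,680 | 370 |
| OE3 (950–1050) | 251,630 | (60.9) | 174,010 | 77,620 |
| OE4 (1050–1150) | 67,380 | (16.3) | 67,380 | - |
| Total | 413,250 | (100.0) | 335,030 (81.1) | 78,220 (18.9) |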
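
The figures in Table 1 can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the names `TABLE_1`, `subperiod_totals` and `shares` are illustrative and not part of the corpus or its documentation, and the "-" in the OE4 verse cell is assumed to mean zero words.

```python
# Word counts copied from Table 1: subperiod -> (prose, verse).
# Assumption: the "-" in the OE4 verse cell is read as 0.
TABLE_1 = {
    "OE1": (1_960, 230),
    "OE2": (91_680, 370),
    "OE3": (174_010, 77_620),
    "OE4": (67_380, 0),
}


def subperiod_totals(table):
    """Return {subperiod: total words} by adding prose and verse."""
    return {name: prose + verse for name, (prose, verse) in table.items()}


def shares(totals):
    """Return each entry's percentage of the overall sum, to one decimal."""
    grand_total = sum(totals.values())
    return {
        name: round(100 * words / grand_total, 1)
        for name, words in totals.items()
    }


totals = subperiod_totals(TABLE_1)
grand_total = sum(totals.values())
prose_total = sum(prose for prose, _ in TABLE_1.values())
verse_total = sum(verse for _, verse in TABLE_1.values())

print(totals)          # words per subperiod
print(shares(totals))  # percentage of the whole Old English section
print(shares({"prose": prose_total, "verse": verse_total}))
```

Recomputing the shares in this way reproduces the percentages printed in the table, including the prose and verse proportions in the last row.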

The average size of the text samples from longer texts is between 5,000 and 10,000 words. When one and the same text is represented in the corpus by different versions (e.g., the Bible, Gregory's Dialogues, or the Chronicle), the samples cover different passages of the text. This procedure has the obvious advantage of providing a larger sample of one particular text, but excludes the possibility of comparing different versions.

2. Chronological coverage

Since the study of variation is the starting point of the Helsinki Corpus, the text material should give as representative a picture of the Old English period as possible, and there should be enough material for comparison. As can be seen from Table 1, the texts are not evenly distributed between the subperiods: the subsections OE1 and OE4 are very small, since little material is available for them. Practically all extant pre-850 texts have been included: Cædmon's Hymn, Bede's Death Song, Ruthwell Cross, The Leiden Riddle, and the earliest documents (Birch 451, Harmer 1, 2, 3, 5, Robertson 3). For the post-Conquest OE4 subperiod relatively limited material is available: late annals of the Chronicle, the Vision of Leofric, and some documents (William's Laws, Robertson's Appendix 1, 3, 4). In addition, some texts which appear in late manuscripts, even though the originals are earlier, have been included in this subperiod. (See also below for an account of dating.)

On the basis of the number of words Old English poetry may seem overrepresented. Its importance for Old English studies cannot, however, be disputed, and therefore a relatively large selection of verse has been included to provide scholars with a representative corpus of poetic language.

3. Dating

Dating the Old English texts and grouping them according to the four-part division of the corpus causes a problem of its own. We have adopted a cautious and conservative approach, relying on previous scholarship and fully acknowledging the difficulties involved. Quite a few texts can be dated by historical evidence (Bede's Death Song, Wærferth's translation of Gregory's Dialogues, Alfredian translations, the works of Ælfric and Wulfstan). The entries of the Chronicle and the battle poems can be given at least termini a quo on the basis of the events they describe (Amos 1980: 1-2). But only a part of the corpus is datable on external or internal evidence.

The texts have been grouped into subsections according to the date of the manuscript sampled. In many cases, the date of the composition of the original version and that of the extant manuscript are different. If both are known, a double coding has been given (e.g., O2/4 for MS C of Gregory's Dialogues). More often, the date of composition cannot be established with any certainty. In those cases it is indicated with X (unknown), e.g., OX/2 for Ine's Laws. The Blickling Homilies included in the corpus have been coded O2/3, Wulfstan's Homilies either O3 or O3/4, Martyrology, Marvels and Alexander's Letter O2/3, Chad O2/4, and Gregory's Dialogues MS H O2/3. As noted above, this principle of grouping the texts has also increased the amount of text for O4, which would otherwise have been considerably smaller.
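
The double coding described above can be read mechanically. The sketch below is illustrative only: `PeriodCode` and `parse_period_code` are hypothetical names, the pattern covers just the code shapes quoted in this section (O3, O2/4, OX/2 and the like), and it assumes that a single code such as O3 means that composition and manuscript fall in the same subperiod.

```python
import re
from typing import NamedTuple, Optional


class PeriodCode(NamedTuple):
    composition: Optional[int]  # None when the code gives X (unknown)
    manuscript: int             # subperiod used for grouping the text


# "O", then a subperiod digit or X, then an optional "/digit" part.
_PATTERN = re.compile(r"O([1-4X])(?:/([1-4]))?")


def parse_period_code(code: str) -> PeriodCode:
    """Split a code such as 'O2/4' into composition and manuscript subperiods."""
    match = _PATTERN.fullmatch(code)
    if match is None:
        raise ValueError(f"unrecognized period code: {code!r}")
    first, second = match.groups()
    if second is None:
        if first == "X":
            raise ValueError(f"no manuscript subperiod in {code!r}")
        # Assumption: a single code means both dates share one subperiod.
        period = int(first)
        return PeriodCode(period, period)
    composition = None if first == "X" else int(first)
    return PeriodCode(composition, int(second))


for code in ("O2/4", "OX/2", "O3"):
    print(code, parse_period_code(code))
```

Because the texts are grouped by manuscript date, the `manuscript` field is the one that decides the subsection: O2/4 places MS C of Gregory's Dialogues in the fourth subperiod.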

A similarly cautious attitude towards dating Old English poetry has been adopted. Apart from the pre-850 poems, The Battle of Brunanburh (O2), and The Meters of Boethius (O2/3), all verse texts have been coded OX/3. Thus no stand has been taken regarding various attempts at dating some of the Old English poems with more precision.

4. Types of text

The selection of the Old English samples has been made with attention to a number of text typological features. As can be seen in Table 2 in the General Introduction, the number of text types in the Old English corpus is smaller than in the Middle or Early Modern English corpora. In our period the written language had not yet realized its full potential in all walks of life. Society was developing, still creating its institutions. Gradually, however, learning started to spread through educational activities undertaken by the Church and through educational programmes launched by individual rulers, most notably by King Alfred. These processes are reflected in the increasing variety of texts towards the end of the Old English period. It must also be remembered that we have no idea of the amount of literature lost in the course of centuries.

Our selection of texts and their division into various types is relatively conventional and needs few comments. The law texts represent the category of statutory prose, while the documents are mostly wills and definitions of boundaries and are intended for individual or otherwise specific situations; thus they do not seem to contain the generalizing power required of the term "statutory". Medical recipes, of which there is no shortage in Old English, are typical representatives of the handbook and secular instruction categories. Astrological writings (prognostications) are also classified as handbooks, but Byrhtferth's Manual and Ælfric's De Temporibus Anni are regarded as scholarly treatises and thus labelled as "scientific". The Manual is clearly instructive, but De Temporibus can be regarded as a representative of expository writing.

The categories "imaginative" and "nonimaginative" mark a distinction between fictional prose, such as Apollonius of Tyre, and nonfictional narration, as, e.g., the histories and chronicles. The lives of saints and martyrs are coded as "nonimaginative" writing, because they were obviously written and read with the same serious attitude as, for example, Bede's or Orosius' histories. Because of their clearly fabulous character, Alexander's Letter and Marvels are regarded as "imaginative narration".

In selecting the texts for the four subsections, one problem was the abundance of texts in OE3 in comparison with the three other subperiods. To avoid too obvious an overrepresentation, some important texts dating from 950–1050 had to be excluded. For instance, there is a minimal amount of text representing "history" in OE3.

Table 2. Old English texts arranged by prototypical text category and text type, with word counts; X = no value

| Prototypical text category | Text type | Text | Word count | Total count |
|---|---|---|---|---|
| OE1 (–850) | | | | |
| X | Document | Documents 1 | 1,960 | |
| X | X | Cædmon's Hymn | 40 | |
| | | Bede's Death Song | 30 | |
| | | Ruthwell Cross | 70 | |
| | | The Leiden Riddle | 90 | |
| | | | | 2,190 |
| OE2 (850–950) | | | | |
| Statutory | Law | Alfred's Introduction to Laws | 1,950 | |
| | | Alfred's Laws | 3,300 | |
| | | Ine's Laws | 2,670 | |
| X | Document | Documents 2 | 2,360 | |
| Secular instruction | Handbook: medicine | Læceboc | 10,420 | |
| X | Philosophy | Alfred's Boethius | 10,920 | |
| Religious instruction | Religious treatise | Alfred's Cura Pastoralis | 17,140 | |
| X | Preface | Alfred's Preface to Cura Pastoralis | 870 | |
| Nonimaginative narration | History | Chronicle MS A Early | 13,460 | |
| | | Bede's Ecclesiastical History | 10,170 | |
| | | Ohthere and Wulfstan MS L | 410 | |
| | | Alfred's Orosius | 8,640 | |
| X | Bible | Vespasian Psalter | 9,370 | |
| X | X | The Battle of Brunanburh | 370 | |
| | | | | 92,050 |
| OE3 (950–1050) | | | | |
| Statutory | Law | Laws (Eleventh Century) | 6,900 | |
| X | Document | Documents 3 | 8,030 | |
| Secular instruction | Handbook: medicine | Lacnunga | 2,720 | |
| | | Medicina de Quadrupedibus | 4,270 | |
| | Science: astronomy | Byrhtferth's Manual | 4,070 | |
| Expository | Science: astronomy | Ælfric's De Temporibus Anni | 5,360 | |
| Religious instruction | Homily | Wulfstan's Homilies | 6,950 | |
| | | The Blickling Homilies | 10,670 | |
| | | Ælfric's Catholic Homilies (II) | 3,130 | |
| | | Ælfric's Homilies (Suppl. II) | 1,720 | |
| | Rule | The Benedictine Rule | 9,970 | |
| | | The Durham Ritual | 10,550 | |
| | Religious treatise | Ælfric's Letters to Wulfstan | 7,960 | |
| | | Ælfric's Letter to Sigefyrth | 1,500 | |
| | Preface | Ælfric's Preface to Cath. Hom. (I) | 1,040 | |
| | | Ælfric's Preface to Genesis | 690 | |
| X | Preface | Ælfric's Preface to Cath. Hom. (II) | 220 | |
| | | Ælfric's Preface to Lives of Saints | 360 | |
| | | Ælfric's Preface to Grammar | 340 | |
| Nonimaginative narration | History | Chronicle MS A Late | 670 | |
| | | Ohthere and Wulfstan MS G | 1,310 | |
| | Biography: life of saint | Ælfric's Lives of Saints | 6,980 | |
| | | Gregory's Dialogues MS H | 5,170 | |
| | | Martyrology | 10,270 | |
| Imaginative narration | Geography | Marvels | 1,690 | |
| | Travelogue | Alexander's Letter | 7,290 | |
| | Fiction | Apollonius of Tyre | 6,530 | |
| X | Bible | The Old Testament (Heptateuch) | 10,240 | |
| | | The Paris Psalter | 8,450 | |
| | | West-Saxon Gospels | 9,920 | |
| | | Lindisfarne Gospels | 8,750 | |
| | | Rushworth Gospels | 10,290 | |
| X | X | Fates of Apostles | 670 | |
| | | Elene | 7,310 | |
| | | Juliana | 4,130 | |
| | | Genesis | 4,840 | |
| | | Exodus | 2,980 | |
| | | Christ | 6,130 | |
| | | The Kentish Hymn | 230 | |
| | | The Kentish Psalm | 840 | |
| | | Andreas | 4,860 | |
| | | The Dream of the Rood | 1,110 | |
| | | The Wanderer | 690 | |
| | | The Seafarer | 770 | |
| | | Widsith | 850 | |
| | | The Fortunes of Men | 550 | |
| | | Maxims I | 1,440 | |
| | | The Riming Poem | 500 | |
| | | The Panther | 390 | |
| | | The Whale | 470 | |
| | | The Partridge | 90 | |
| | | Deor | 230 | |
| | | Wulf and Eadwacer | 120 | |
| | | The Wife's Lament | 320 | |
| | | Beowulf | 17,310 | |
| | | Riddles | 5,090 | |
| | | The Metrical Psalms of the Paris Psalter | 6,720 | |
| | | Phoenix | 3,710 | |
| | | The Meters of Boethius | 5,270 | |
| | | | | 251,630 |
| OE4 (1050–1150) | | | | |
| Statutory | Law | Late Laws | 2,100 | |
| | | William's Laws | 220 | |
| X | Document | Documents 4 | 2,440 | |
| Secular instruction | Handbook: astronomy | Prognostications | 3,350 | |
| | Philosophy | Dicts of Cato | 2,080 | |
| Religious instruction | Homily | Wulfstan's Homilies | 3,290 | |
| | | A Homily for the Sixth ... Sunday | 1,610 | |
| | Rule | Wulfstan's Institutes of Polity | 4,760 | |
| | Religious treatise | Ælfric's Letter to Sigeweard | 10,180 | |
| | | Ælfric's Letter to Wulfsige | 3,230 | |
| | | Adrian and Ritheus | 1,090 | |
| | | Solomon and Saturn | 2,010 | |
| | | Vision of Leofric | 1,010 | |
| X | Preface | Alfred's Preface to Soliloquies | 440 | |
| Nonimaginative narration | History | Chronicle MS E | 17,620 | |
| | Biography: life of saint | Chad | 2,650 | |
| | | Gregory's Dialogues MS C | 5,100 | |
| | | A Passion of St Margaret | 4,200 | |
| | | | | 67,380 |
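
As an illustration of how the counts in Table 2 can be aggregated, the following sketch sums the OE4 word counts by prototypical text category. It is not part of the corpus tools: the names `OE4_ROWS` and `words_by_category` are hypothetical, and the assignment of each text to a category follows one reading of the table layout, in which a text without a category of its own belongs to the category last named above it.

```python
from collections import defaultdict

# OE4 rows copied from Table 2: (prototypical text category, text, words).
# Assumption: a text listed without a category shares the one named above it.
OE4_ROWS = [
    ("Statutory", "Late Laws", 2_100),
    ("Statutory", "William's Laws", 220),
    ("X", "Documents 4", 2_440),
    ("Secular instruction", "Prognostications", 3_350),
    ("Secular instruction", "Dicts of Cato", 2_080),
    ("Religious instruction", "Wulfstan's Homilies", 3_290),
    ("Religious instruction", "A Homily for the Sixth ... Sunday", 1_610),
    ("Religious instruction", "Wulfstan's Institutes of Polity", 4_760),
    ("Religious instruction", "Ælfric's Letter to Sigeweard", 10_180),
    ("Religious instruction", "Ælfric's Letter to Wulfsige", 3_230),
    ("Religious instruction", "Adrian and Ritheus", 1_090),
    ("Religious instruction", "Solomon and Saturn", 2_010),
    ("Religious instruction", "Vision of Leofric", 1_010),
    ("X", "Alfred's Preface to Soliloquies", 440),
    ("Nonimaginative narration", "Chronicle MS E", 17_620),
    ("Nonimaginative narration", "Chad", 2_650),
    ("Nonimaginative narration", "Gregory's Dialogues MS C", 5_100),
    ("Nonimaginative narration", "A Passion of St Margaret", 4_200),
]


def words_by_category(rows):
    """Sum the word counts of all texts sharing a prototypical category."""
    totals = defaultdict(int)
    for category, _text, words in rows:
        totals[category] += words
    return dict(totals)


category_totals = words_by_category(OE4_ROWS)
oe4_total = sum(category_totals.values())

for category, words in category_totals.items():
    print(f"{category}: {words:,}")
print(f"OE4 total: {oe4_total:,}")
```

The category sums add up to the OE4 total given in the last column of the table, which is a quick way of checking that no row has been misread.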

Of the texts that could be classified under the prototypical text category "Imaginative narration", only Apollonius of Tyre is regarded as "fiction"; Alexander's Letter is more of the type "travelogue", and Marvels is placed under the text type "geography" for lack of a better term.

In the subcorpus OE4 the didactic prose texts Adrian and Ritheus and Solomon and Saturn are subsumed under "religious treatises" as no other suitable term was available.

In contrast to the Toronto Corpus classification "biography", Blickling Homily 17 is included in the text type "homilies" because it does not seem to give obvious biographical information; instead it shows features characteristic of a homily, for instance in addressing the audience and in its description of the dedication of St Michael's Church by the Archangel himself (Wrenn 1967 [1980]: 243).

5. Dialect

Stating the dialectal background of many Old English texts calls for particular caution in the same way as chronology does. The aim of the compilers of the corpus has been to provide as comprehensive a selection as possible. At the same time the limitations imposed by the available material have to be acknowledged. When previous scholarship seems to be more or less in agreement about the dialectal background of a particular text, the samples have been given a simple code value WS (West-Saxon), AM (Mercian), AN (Northumbrian) or K (Kentish). Thus texts like the Chronicle, Alfred's Laws, Cura Pastoralis, Boethius, most of Ælfric's and Wulfstan's writings, West-Saxon Gospels, and document Birch 451 (the only pre-850 West-Saxon text) have been classified as representatives of the West-Saxon dialect. The representation of "pure" Anglian dialects is much more limited: the code value AN has been given to the four early verse texts mentioned above, as well as to The Durham Ritual and Lindisfarne Gospels. AM applies to Vespasian Psalter, Rushworth Gospels (Matthew), and the documentary texts Codex Aureus, Harmer 3 and Robertson 16. The nearest approximation of the Kentish dialect is represented by documents only, namely Harmer 1, 2, 4, 5 and 7, and Robertson 3, 6 and 10.

In most cases, coding for a mixed dialect seems to be the only solution. If a text represents a mixed dialect whose elements can be identified to a certain degree, the code markers WS/A, WS/AM and WS/K have been adopted. Thus texts like Bede's Ecclesiastical History, Læceboc, Lacnunga, The Blickling Homilies, Marvels and Alexander's Letter have been identified as representatives of WS/A. Texts coded WS/AM are Martyrology, Chad, and MS C of Gregory's Dialogues, which clearly shows Mercian elements ascribable to Wærferth's original version. Strong Kentish elements have been identified in The Kentish Hymn and Psalm, which have received the code marking WS/K.

If the underlying second element cannot be identified so as to be used as evidence for a particular dialect, this element is defined as "unknown" with the code X. The majority of texts, including many legal and documentary texts as well as MS H of Gregory's Dialogues, are coded WS/X as representative of the developing West-Saxon-based standard. Apart from the Kentish and Northumbrian poems mentioned above, all verse is defined as WS/X, thus acknowledging the difficulty in pointing out specific dialectal features or making definite statements on the dialectal origin of any poems. In deciding upon the dialectal background of several documentary texts, we have relied on the information provided by the editors of that material. Thus documents Whitelock 14 and 15 are coded as A/X, Harmer 12 and Robertson 1, 2 as AM/X, and Robertson 32, the only representative of a mixed Kentish dialect, is coded K/X.
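
The simple and mixed dialect codes discussed in this section share one shape: a primary element, optionally followed by a slash and a second element. A minimal sketch of reading them is given below. The names `DIALECT_NAMES` and `parse_dialect_code` are hypothetical, and the expansion of A as "Anglian" is an assumption based on the reference to Anglian dialects above.

```python
from typing import Optional, Tuple

# Code values quoted in this section.
DIALECT_NAMES = {
    "WS": "West-Saxon",
    "AM": "Mercian",
    "AN": "Northumbrian",
    "K": "Kentish",
    "A": "Anglian",  # assumption: A is not spelled out in the text
    "X": "unknown",
}


def parse_dialect_code(code: str) -> Tuple[str, Optional[str]]:
    """Return (primary dialect, second element or None) for a code like 'WS/AM'."""
    parts = code.split("/")
    if len(parts) not in (1, 2) or any(p not in DIALECT_NAMES for p in parts):
        raise ValueError(f"unrecognized dialect code: {code!r}")
    primary = DIALECT_NAMES[parts[0]]
    secondary = DIALECT_NAMES[parts[1]] if len(parts) == 2 else None
    return primary, secondary


for code in ("WS", "AN", "WS/A", "WS/K", "WS/X", "K/X"):
    print(code, parse_dialect_code(code))
```

A second element of None marks a "pure" dialect coding, while a second element of "unknown" corresponds to the X discussed above.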

6. Relationship to Latin

The pervasive role of the Latin language in Anglo-Saxon civilization is reflected, among other things, in the fact that many Old English texts stand in some kind of relationship to Latin originals. At the initial stages of planning, we conceived this relationship in terms of the following four values: an Old English text could be a gloss, translation, paraphrase, or independent of a Latin original. As it stands, the text parameter <G> (relationship to foreign original) now has only three values for the Old English period: GLOSS, TRANSL and X, where X stands for "irrelevant or unknown". This calls for some comment.

Glosses are an easily recognizable group which does not cause any particular problems. With translations, however, the situation is complex. Old English translations can be seen to form a continuum, with close translations at one end – the West-Saxon Gospels or Bede's Ecclesiastical History would be typical examples – and, at the other extreme, paraphrases like many of those produced by Ælfric. Somewhere in between are translations like Cura Pastoralis, which is to some extent fairly close, but in part reflects Alfred's predilection for loose paraphrasing.

The existence of this continuum makes it immediately evident that it is in many cases extremely difficult to decide whether a particular text represents translation or paraphrase. As a result of this difficulty, often heightened by the additional complication that one and the same text shows two or three different relationships to Latin, we in the end abandoned the term "paraphrase" and made the term "translation" cover all possible renderings of a Latin text, whether close or free. This decision in fact meant that we viewed translation in exactly the same way as did the Anglo-Saxons themselves: for them the process embraced both word be worde and andgit of andgiete. In the light of this it is perfectly logical that Ælfric uses one and the same verb awendan with reference both to actual translation and paraphrase (Nichols 1964: 7-9).

As noted above, according to our original plans one of the values of <G> was to be "independent of Latin". There are undoubtedly a few texts, like Alfred's Preface to Cura Pastoralis, for which this relationship to Latin can be said to be established beyond reasonable doubt, but generally speaking it is hazardous to claim that a certain text is independent of Latin when in fact the claim is based on an argumentum ex silentio. This is an area, too, where the situation is in constant flux: new Latin sources for Old English texts are being detected or suggested all the time. Thus, the value "independent of Latin" had to give way to X – a solution which is safe, but open to criticism as it lumps together texts which are almost certainly independent of Latin and texts for which Latin sources exist but the relationship between Latin and Old English is complex or difficult to specify (e.g., many homilies and biblical poems).
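
The reduction from the four values originally planned to the three values finally adopted can be summarized as a simple mapping. The sketch below is illustrative only: the lower-case labels for the planned values and the names `PLANNED_TO_FINAL` and `final_g_value` are hypothetical, whereas GLOSS, TRANSL and X are the values named above.

```python
# Planned relationship to a Latin original -> value actually used for <G>.
PLANNED_TO_FINAL = {
    "gloss": "GLOSS",
    "translation": "TRANSL",
    "paraphrase": "TRANSL",  # "translation" covers close and free renderings
    "independent": "X",      # X also covers complex or unspecifiable cases
}


def final_g_value(planned: str) -> str:
    """Return the adopted <G> value for one of the originally planned values."""
    try:
        return PLANNED_TO_FINAL[planned]
    except KeyError:
        raise ValueError(f"not one of the planned values: {planned!r}") from None


for planned in PLANNED_TO_FINAL:
    print(f"{planned} -> {final_g_value(planned)}")
```

The mapping runs in one direction only: a text coded X cannot be assumed to be independent of Latin, since X also stands for "irrelevant or unknown".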

7. Individual authors

The text parameter <A> indicates the author, where known, of a particular text, whether an original work, a translation or even a gloss. A large proportion of Old English texts, particularly poetry, remain firmly anonymous, but there are at least four authors so well represented in the Helsinki Corpus that it is possible to study their idiolects: Alfred, Ælfric and Wulfstan among the prose writers, and Cynewulf among the poets. The decisions concerning authorship follow a conservative line. For example, the prose part (Psalms 1-50) of The Paris Psalter has not been included among Alfred's writings because, although this text is now considered by most scholars to belong to the Alfredian canon, complete agreement has not been reached. On the other hand, The Meters of Boethius has been ascribed to Alfred, forming as it does a part of the Alfredian translation of Boethius; we are, however, well aware of the doubts raised about Alfred's authorship. Of the two parallel versions of Gregory's Dialogues, only the one contained in MS C has been ascribed to Wærferth: the revision (MS H), although partly reflecting the wording of Wærferth's original translation, is generally speaking so thorough a revision of the original that we decided to give it the value X for <A>. This was also done with two short, early poems, Cædmon's Hymn and Bede's Death Song. It is true that they have been traditionally attributed to Cædmon and the Venerable Bede; however, in the case of Cædmon's Hymn, the genesis of the poem is shrouded in legend, while in the case of Bede's Death Song a close reading of the appropriate passage in Cuthbert's letter introducing the poem does not unequivocally suggest that Bede was the author of the poem.

8. Origin of the Old English section of the Helsinki Corpus

The Old English section of the Helsinki Corpus is based on a machine-readable transcript of c. 2,000 surviving Old English texts prepared at the University of Toronto as a preliminary step in the preparation of the Dictionary of Old English and also used as the basis of the Microfiche Concordance to Old English (Healey - Venezky 1980) and A Microfiche Concordance to Old English: The High-Frequency Words (Venezky - Butler 1985). [a] The Toronto Corpus (Release 1, October 1982) was made available to us by the kind permission of the Dictionary of Old English and through the services of the Oxford Text Archive. The Old English section of the Helsinki Corpus represents about one seventh of the Toronto Corpus.

As a result of proofreading and subsequent checks and corrections the Toronto Corpus version and our text differ in a number of readings. When proofreading we have turned to the final published versions of works available to the compilers of the Toronto Corpus only in their manuscript form (Martyrology, Adrian and Ritheus, Solomon and Saturn, etc.).

In 1989 and in early 1990, all the Old English texts in the Helsinki Corpus were proofread by comparing them to the printed editions of the texts. A fair number of discrepancies were found; it was, however, impossible to decide whether they were due to typing errors or due to the fact that the Dictionary of Old English staff had consulted copies of the original manuscripts, and, in a couple of cases, had used editions that were not available to us. This problem was partly solved through correspondence with the Dictionary of Old English staff; moreover, Matti Kilpiö spent the Easter of 1990 in Toronto checking some of the queries. The remaining checks were kindly carried out by Robert Stanton from the Dictionary of Old English.

New discrepancies have been found since then: the normal procedure has been to add "our comment" to such divergent cases to ensure that the user of the Helsinki Corpus is made aware of the difference between the Helsinki Corpus text and the edition available in the Helsinki Corpus files.

Notes

[a] See also http://quod.lib.umich.edu/o/oec/ and http://www.doe.utoronto.ca/.

References

Amos, Ashley Crandell 1980. Linguistic means of determining the dates of Old English literary texts. Cambridge, MA: Medieval Academy of America.

Healey, Antonette diPaolo - Richard L. Venezky 1980. A microfiche concordance to Old English. (Publications of the Dictionary of Old English 1.) Toronto: The Pontifical Institute of Mediaeval Studies.

Nichols, Ann Eljenholm 1964. "Awendan: A note on Ælfric's vocabulary", Journal of English and Germanic Philology 63: 7-13.

Rissanen, Matti 1992. "Computers are useful - for aught I know", in: Fran Colman (ed.), Evidence for Old English (Edinburgh Studies in the English Language 2.), 155-168.

Venezky, Richard L. - Sharon Butler 1985. A microfiche concordance to Old English: The high-frequency words. (Publications of the Dictionary of Old English 2.) Toronto: The Pontifical Institute of Mediaeval Studies.

Wrenn, Charles Leslie 1967 [1980]. A study of Old English literature. London: Harrap.