The York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE)

This 1.5-million-word, syntactically annotated corpus of Old English prose is based on the Toronto Dictionary of Old English Corpus and contains all the major Old English prose works. Each word is tagged for part of speech, and detailed clause structure is represented by labelled bracketing. The annotation system relates directly to generative models (representing, e.g., wh-movement and empty pronominal subjects) but is adapted to the needs of searchers and avoids abstraction that is unnecessary from this perspective. The corpus can thus be searched automatically for syntactic structure, constituent order and lexical items, using any search engine that handles Penn Treebank format, including CorpusSearch, a search engine written specifically with linguists in mind. Substantial manuals set out details of the parsing scheme and explain the use of CorpusSearch, and include self-instruction materials.
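To give a rough idea of what searching labelled bracketing involves, the following is a minimal Python sketch that reads one Penn Treebank-style tree and collects every constituent with a given label. It is not CorpusSearch and does not use the CorpusSearch query language. The function names (parse_bracketed, find_label, leaves) are hypothetical, and the example tree and its labels are invented for illustration rather than taken from the corpus or its parsing scheme.

```python
# Minimal sketch: read a labelled-bracketing tree and search it by label.
# Assumption: one tree per string, in the general shape (LABEL child child ...).
# The example tree below is invented; it is not an actual corpus token.


def parse_bracketed(text):
    """Parse a bracketed tree into nested (label, children) tuples.

    Each child is either another (label, children) tuple or a word string.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def find_label(tree, label):
    """Return every subtree whose label equals `label`, in left-to-right order."""
    node_label, children = tree
    found = [tree] if node_label == label else []
    for child in children:
        if isinstance(child, tuple):
            found.extend(find_label(child, label))
    return found


def leaves(tree):
    """Return the words dominated by `tree`, in left-to-right order."""
    words = []
    for child in tree[1]:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


if __name__ == "__main__":
    example = "(IP (NP (PRO he)) (VBD com) (PP (P to) (NP (N tune))))"
    tree = parse_bracketed(example)
    for np in find_label(tree, "NP"):
        print(" ".join(leaves(np)))
```

A dedicated tool such as CorpusSearch expresses queries in terms of structural relations between constituents instead of walking the tree by hand, so this sketch only illustrates why the bracketed format lends itself to automatic search.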

Project leaders: Anthony Warner & Susan Pintzuk
Time of compilation: 2000–2003
Size: 1.5 million words
Language: Old English
Number of texts/samples: 100
Period: Old English
Released: 2003
Funding: Arts and Humanities Research Board
Project home page:

Reference line and copyright

Taylor, A., A. Warner, S. Pintzuk, and F. Beths (2003). The York-Toronto-Helsinki Parsed Corpus of Old English Prose. Electronic texts and manuals available from the Oxford Text Archive.

Many of the texts in the corpus are subject to copyright restrictions. The authors hold copyright in the annotations, and freely grant users permission to reproduce the annotations in the course of non-commercial scholarly activity.


Ann Taylor (2003)


Professor Anthony Warner (Project leader)
Professor Susan Pintzuk (Co-Project leader)
Dr. Ann Taylor
Frank Beths


Electronic texts and manuals are freely available for research purposes from the Oxford Text Archive (text 2462). Manuals and text information online at

Associated projects

The Dictionary of Old English Project (DOE)
The Helsinki Corpus of English Texts (HC)
The York-Helsinki Parsed Corpus of Old English Poetry
The Penn-Helsinki Parsed Corpus of Middle English 2 (PPCME2)
The Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME)
The Parsed Corpus of Early English Correspondence (PCEEC)