| 
              
              
              
                | Corpus Finder
                    To sort corpora according to any attribute, click on the appropriate column header. Use the filters to view a specific selection of corpora. For explanations of the table categories, see below.
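                    For readers working with an exported copy of the table rather than the interactive page, the same filter-and-sort behaviour can be approximated in a few lines of Python. This is only a sketch: the dictionary keys and the helper names are hypothetical, and the four sample rows are copied from the table.

```python
from typing import Dict, List

# Four rows copied from the Corpus Finder table. The key names are
# hypothetical and chosen for this sketch only.
ROWS: List[Dict[str, str]] = [
    {"corpus": "BROWN", "word_count": "1,000,000", "availability": "License required"},
    {"corpus": "CLOB", "word_count": "1,000,000", "availability": "Open access"},
    {"corpus": "BAWE", "word_count": "6,506,995", "availability": "Free subscription"},
    {"corpus": "SCEPA", "word_count": "22,538", "availability": "Open access"},
]


def word_count(row: Dict[str, str]) -> int:
    """Convert a word count such as '6,506,995' to an integer; empty cells count as 0."""
    digits = row["word_count"].replace(",", "")
    return int(digits) if digits else 0


def filter_by_availability(rows: List[Dict[str, str]], value: str) -> List[Dict[str, str]]:
    """Keep only the rows whose Availability cell equals the given value."""
    return [row for row in rows if row["availability"] == value]


# Filter to open-access corpora, then sort by word count, largest first.
open_access = filter_by_availability(ROWS, "Open access")
largest_first = sorted(open_access, key=word_count, reverse=True)
names = [row["corpus"] for row in largest_first]
print(names)
```

                    Word counts are stored with thousands separators in the table, so they need to be converted to integers before a numeric sort gives a sensible order.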
				   
                    
                      | Corpus | Start | End | Periods | Word Count | Text Samples | Spoken/Written | Annotation | Format | Availability |
                      | ALEC - Advanced Learner English Corpus | 2004 | 2013 | PDE | 1,300,000 | 146 | Written | None |  | Not available |  
                      | APU - APU Writing and Reading Corpus 1979-1988 | 1979 | 1988 | PDE | 172,000 | 543 | Written |  | Online | Free subscription |
                      | ARCHER - A Representative Corpus of Historical English Registers | 1600 | 1999 | EModE LModE PDE |  |  |  |  | On-site Online |  |
                      | BASE - British Academic Spoken English Corpus | 2000 | 2005 | PDE |  |  | Spoken |  | Download | Free subscription |  
                      | BAWE - British Academic Written English Corpus | 2000 | 2007 | PDE | 6,506,995 | 2761 | Written | Tagging Other | Download | Free subscription |
                      | BE06 - The British English 2006 corpus | 2003 | 2008 | PDE | 1,010,996 | 500 | Written | Tagging | Online | License required |
                      | BLOB-1931 - The BLOB-1931 Corpus | 1928 | 1934 | PDE | 1,000,000 | 500 | Written | Tagging None |  | In preparation |
                      | BNC - British National Corpus |  |  | PDE | 100,000,000 |  | Written & Spoken | Tagging Other | Download | Free subscription |
                      | BROWN - A Standard Corpus of Present-Day Edited American English | 1961 | 1961 | PDE | 1,000,000 | 500 | Written | Tagging Other None | CD | License required |
                      | B-BROWN - The 1930s Brown Corpus | 1928 | 1934 | PDE | 1,000,000 | 500 | Written | Tagging Parsing None Other | On-site | In preparation |
                      | Buckeye Corpus | 1998 | 2000 | PDE | 300,000 | 40 | Spoken | Other | Download | Free subscription |  
                      | CASE - Corpus of Academic Spoken English | 2012 |  | PDE |  | 300 | Spoken | Tagging Other | Online | In preparation |
                      | CC - The Coruña Corpus of English Scientific Writing | 1700 | 1900 | LModE |  |  | Written |  | CD Download | Open access |
                    | CC: CETA - A Corpus of English Texts on Astronomy | 1700 | 1900 | LModE | 409,909 | 42 | Written | Other | CD Download | Open access |
                    | CC: CEPhiT - A Corpus of English Philosophy Texts | 1700 | 1900 | LModE | 400,416 | 40 | Written | Other | CD Download | Open access |
                    | CC: CHET - A Corpus of History English Texts | 1700 | 1900 | LModE | 404,311 | 40 | Written | Other | Download | Open access |  
                    | CC: CELiST - A Corpus of English Life Sciences Texts | 1700 | 1900 | LModE | 400,305 | 40 | Written | Other |  | In preparation |  
                    | CC: Women Scientists - A Corpus of Women Scientists | 1700 | 1930 | LModE PDE |  | 40 | Written |  |  | In preparation |
                      | CED - A Corpus of English Dialogues 1560-1760 | 1560 | 1760 | EModE | 1,183,690 | 177 | Written & Spoken | Other | CD Download | License required |
                      | CEEC - Corpus of Early English Correspondence | 1402 | 1800 | ME EModE LModE | 5,100,000 | 12,000 | Written | Other | On-site | Free subscription |
                      | CEEC - Corpus of Early English Correspondence / 1998 version | 1410 | 1681 | ME EModE | 2,597,795 | 5,961 | Written | Other | On-site | Free subscription |
                      | CEECE - Corpus of Early English Correspondence Extension | 1653 | 1800 | EModE LModE | 2,219,422 | 4,923 | Written | Other | On-site | Free subscription |
                      | CEECE: TCEECE - Tagged Corpus of Early English Correspondence Extension | 1653 | 1800 | EModE LModE | 2,219,422 | 4,923 | Written | Tagging Other | On-site | Free subscription |
                      | CEECES - Corpus of Early English Correspondence Extension Sampler | 1653 | 1800 | EModE LModE | 1,140,286 | 2,624 | Written | Tagging Other | Download | Open access |
                      | CEECSU - Corpus of Early English Correspondence Supplement | 1402 | 1663 | ME EModE | 442,484 | 829 | Written | Other | On-site | Free subscription |
                      | CEECS - Corpus of Early English Correspondence Sampler | 1418 | 1680 | ME EModE | 450,085 | 1,123 | Written | Other | CD Download | Free subscription |
                      | CEEC: PCEEC - Parsed Corpus of Early English Correspondence | 1410 | 1681 | ME EModE | 2,159,132 | 4,970 | Written | Tagging Parsing Other | Download | Free subscription |
                      | CEEC: PCEEC2 - Parsed Corpus of Early English Correspondence, 2nd edition | 1410 | 1681 | ME EModE | 2,159,132 | 4,970 | Written | Parsing Other | Download | Open access |
                      | CEEM - Corpus of Early English Medical Writing | 1375 | 1800 | ME EModE LModE | 4,500,000 | 1,164 | Written | Other | CD | Commercial |
                      | CEEM: MEMT - Middle English Medical Texts | 1375 | 1500 | ME | 495,322 | 86 | Written | Other | CD | Commercial |  
                      | CEEM: EMEMT - Early Modern English Medical Texts | 1500 | 1700 | EModE | 2,000,000 | 450 | Written | Other | CD | Commercial |  
                      | CEEM: LMEMT - Late Modern English Medical Texts | 1700 | 1800 | LModE | 2,000,000 | 628 | Written | Other | CD | Commercial |
                      | CHELAR - Corpus of Historical English Law Reports 1535-1999 | 1535 | 1999 | EModE LModE PDE | 463,009 | 369 | Written | Tagging None | Download | Open access |
                      | CIE - A Corpus of Irish English 14th–20th c. | 14c | present | ME EModE LModE PDE |  | 70 | Written | Other | CD | From compiler |
                      | CLEP - Corpus of Late 18th c. Prose | 1761 | 1790 | LModE | 300,000 | 1827 | Written | None | Download | Free subscription |  
                      | CLOB - A Brown Family Corpus of Written British English | 2008 | 2011 | PDE | 1,000,000 | 500 | Written |  | Download | Open access |  
                      | CLMETEV - The Corpus of Late Modern English Texts | 1710 | 1920 | LModE | 15,000,000 | 176 | Written | None | Download | Free subscription |  
                      | CMEPV - Corpus of Middle English Prose and Verse |  |  | ME |  | 62 | Written |  | Online | Open access |  
                      | CMSW - Corpus of Modern Scottish Writing | 1700 | 1945 | LModE | 5,500,000 | 62 | Written |  | Online | Open access |
                      | CNNE - Corpus of Nineteenth-century Newspaper English | 1830 | 1895 | LModE | 320,000 | 200 | Written | None | On-site | Not available |
                      | COCA - Corpus of Contemporary American English | 1990 | 2009 | PDE | 520,000,000 | - | Written & Spoken | Tagging | Online | Free subscription |  
                      | CoCELD - Corpus of Contemporary English Legal Decisions, 1950–2021 | 1950 | 2021 | PDE | 733,227 | 288 | Written | Tagging | Download | Free subscription |  
                      | CoER - Corpus of Early English Recipes | 1375 | 1900 | ME EModE LModE | 1,500,000 | 150 | Written |  |  | In preparation |
                      | COERP - Corpus of English Religious Prose | 1150 | 1800 | ME EModE LModE |  |  | Written | Other | - | In preparation |
                      | COHA - Corpus of Historical American English | 1810 | 2009 | LModE PDE | 4,000,000 | 100,000 | Written | Tagging | Online | Open access |
                      | COLMOBAENG - Corpus of Late Modern British and American English Prose | 2006 | 2007 | PDE | 1,170,000 | 173 | Written | None | Download | Open access |
                      | CoNE - Corpus of Narrative Etymologies | 1150 | 1325 | EME |  |  | Written | Tagging Other | Online | Open access |
                      | CONTE-pC - Corpus of Early Ontario English, pre-Confederation Section | 1776 | 1849 | LModE | 125,000 |  | Written |  |  | From compiler |
                      | CONTRAST-IT | 2011 | 2015 | PDE | 300,000 |  | Written | Tagging | Online | Open access |  
					  | COOEE - Corpus of Oz Early English | 1788 | 1900 | LModE | 2,000,000 | 1353 | Written | Other | - | Free subscription |  
                      | CoSiB - Corpus of Singaporean Blogs | 2006 | 2010 | PDE | 200,000 | 100 | Written |  |  | From compiler |
                      | CoWITE - Corpus of Women’s Instructive Texts in English | 1550 | 1899 | EModE LModE | 1,750,000 |  | Written | Tagging Other | Online Download | In preparation |
                      | CoWITE16 - Corpus of Women’s Instructive Texts in English (1550–1599) | 1550 | 1599 | EModE | 250,000 |  | Written | Tagging Other | Online Download | In preparation |
                      | CoWITE17 - Corpus of Women’s Instructive Texts in English (1600–1699) | 1600 | 1699 | EModE | 500,000 |  | Written | Tagging Other | Online Download | In preparation |
                      | CoWITE18 - Corpus of Women’s Instructive Texts in English (1700–1799) | 1700 | 1799 | LModE | 500,000 |  | Written | Tagging Other | Online Download | Free subscription Open access |
                      | CoWITE19 - Corpus of Women’s Instructive Texts in English (1800–1899) | 1800 | 1899 | LModE | 500,000 |  | Written | Tagging Other | Online Download | Free subscription Open access |
                      | CROWN - Crown Corpus | 2008 | 2011 | PDE | 1,026,226 | 500 | Written | Tagging Parsing | Online | Open access |
                      | CSC - Corpus of Scottish Correspondence | 1500 | 1715 | EModE | 256,300 | 719 | Written | Other | Online | In preparation |  
                      | DCPSE - Diachronic Corpus of Present-Day Spoken English | 1958 | 1992 | PDE | 800,000 | 280 | Spoken | Tagging Parsing Other | CD | Commercial |
                      | DECTE - Diachronic Electronic Corpus of Tyneside English | 1960 | 2000 | PDE | 804,266 | 99 | Spoken |  | Download DVD | From compiler |
                      | DOEC - Dictionary of Old English Corpus | 600 | 1150 | OE | 4,000,000 | 3060 | Written | Other | CD | License required |  
                      | ELFA - English as a Lingua Franca in Academic Settings | 2001 | 2008 | PDE | 1,010,834 | 165 | Written & Spoken |  | CD | License required |  
                      | ENPC - The English-Norwegian Parallel Corpus | 1975 | 1995 | PDE | 2,600,000 | 100 | Written | Tagging Other | On-site | Free subscription |
                      | FLOB - The Freiburg-Lancaster-Oslo/Bergen Corpus | 1992 | 1992 | PDE | 1,000,000 | 500 | Written | Tagging None | CD | License required |
                      | FRED - Freiburg Corpus of English Dialects | 1970 | 1999 | PDE | 1,011,396 | 121 | Spoken | Other | On-site CD | Free subscription |
                      | FROWN - The Freiburg-Brown Corpus | 1991 | 1991 | PDE | 1,000,000 | 500 | Written | Tagging None | CD | License required |
                      | Google Books - Google Books Corpora | 1500 | 2009 | EModE LModE PDE | 2000,000,000,000 |  | Written | Tagging Other | Online | Open access |
					   | HARES - Helsinki Corpus of Regional English Speech | 1970 | 1980 | PDE |  |  | Spoken | Other | Download | License required |  
                      | HC - Helsinki Corpus | 730 | 1710 | OE ME EModE | 1,572,800 | 450 | Written | Other | CD Download | License required |
                      | HCOS - Helsinki Corpus of Older Scots | 1450 | 1700 | EModE | 834,200 | 71 | Written | Other | CD | License required |  
                      | HD - Helsinki Corpus of British English Dialects | 1970 | 1985 | PDE | 1,008,641 | 187 | Spoken | Other | On-site | Free subscription |  
                      | HUM19UK - HUM19UK Corpus | 1800 | 1899 | LModE | 13,000,000 | 100 | Written | Other | Download | Open access |  
                      | ICE - International Corpus of English |  |  |  |  |  | Written & Spoken | Tagging Other | Download CD | Free subscription |
					   | ICE-GB: International Corpus of English - The British component | 1990 | 1993 | PDE | 1,061,264 | 500 | Written & Spoken | Tagging Parsing    | CD | License required |  
					   | ICE-GBR: International Corpus of English - Gibraltar | 2000 | 1993 | PDE | 1,000,000 |  | Written & Spoken |  |  | In preparation |  
					   | ICE-NIG: International Corpus of English - Nigeria | 2000 |  | PDE | 1,000,000 | 902 | Written & Spoken | Tagging | Download | Open access |  
					   | ICE-SCO: International Corpus of English - Scotland | 2013 | 2016 | PDE | 1,000,000 |  | Written & Spoken | Tagging | Download | In preparation |  
                      | ICoMEP - Innsbruck Corpus of Middle English Prose |  |  | ME | 7,800,000 | 129 | Written |  | CD | From compiler |  
					   | JSCC - The John Swales Conference Corpus | 2006 | 2006 | PDE | 100,000 | 23 | Spoken | None | Download | Open access |  
                      | LAEME - A Linguistic Atlas of Early Middle English | 1150 | 1325 | ME | 816,170 | 167 | Written | Other | Online | Open access |  
                      | eLALME - A Linguistic Atlas of Late Mediaeval English | 1150 | 1325 | ME |  |  | Written | Other | Online | Open access |  
                      | LAMSAS - A Linguistic Atlas of the Middle and South Atlantic States | 1933 | 1974 | PDE |  |  | Spoken | Other | Online | Open access |  
                      | LC - The Lampeter Corpus of Early Modern English Tracts | 1640 | 1740 | EModE | 1,193,385 | 120 | Written | Other | CD Download | License required |
					   | LLC - The London-Lund Corpus of Spoken English | 1953 | 1987 | PDE | 500,000 | 100 | Spoken | Other | CD | License required |  
                      | LOB - The Lancaster-Oslo/Bergen Corpus | 1961 | 1961 | PDE | 1,000,000 | 500 | Written | Tagging None | CD | License required |
                      | MCEESP - The Málaga Corpus of Early English Scientific Prose | 1350 | 1900 | ME EModE LModE | 6,000,000 |  | Written | Tagging | Online Download | In preparation |
                      | MCLMESP - The Málaga Corpus of Late Middle English Scientific Prose | 1350 | 1500 | ME | 1,500,000 |  | Written | Tagging Other | Download | Open access |
                      | MCEModESP - The Málaga Corpus of Early Modern English Scientific Prose | 1500 | 1700 | EModE | 1,500,000 |  | Written | Tagging | Online Download | Free subscription |
                      | MCLModESP - The Málaga Corpus of Late Modern English Scientific Prose | 1700 | 1900 | LModE | 3,000,000 |  | Written | Tagging | Online Download | In preparation |
					   | MEG-C - The Middle English Grammar Corpus | 1350 | 1500 | ME | 450,000 | 320 | Written | Other | Download | Open access |  
                      | MICASE - Michigan Corpus of Academic Spoken English | 1997 | 2001 | PDE | 1,800,000 | 152 | Spoken | Other | CD Download Online | Open access |
                      | MICUSP - Michigan Corpus of Upper-level Student Papers | 2002 | 2009 | PDE | 2,600,000 | 829 | Written | Other | Online | Open access |  
                      | MOECS - Corpus of Multilingual Opinion Essays by College Students | 2007 | 2016 | PDE |  | 477 | Written |  | Download | Free subscription |  
                      | NECTE - Newcastle Electronic Corpus of Tyneside English | 1969 | 1994 | PDE |  | 62 | Spoken | Other | Download DVD | Free subscription |
					   | OBC - Old Bailey Corpus | 1720 | 1913 | EModE LModE | 14,000,000 |  | Spoken | Tagging | Online | Free subscription |  
                      | PPCEME - The Penn-Helsinki Parsed Corpus of Early Modern English | 1500 | 1710 | EModE | 1,794,010 | 229 | Written | Tagging Parsing None | CD | License required |
                      | PPCMBE - The Penn-Helsinki Parsed Corpus of Modern British English | 1700 | 1914 | LModE PDE | 948,895 | 101 | Written | Tagging Parsing None | CD | License required |
                      | PPCME2 - The Penn-Helsinki Parsed Corpus of Middle English, 2nd edition | 1150 | 1500 | ME | 1,155,965 | 55 | Written | Tagging Parsing None | CD | License required |
                      | PWEC - Pakistan Written English Corpus | 2020 | 2023 | PDE | 7,586,110 | 4,158 | Written | None |  | From compiler |  
                      | QHC - Quaker Historical Corpus | 1650 | 1699 | EModE | 722,370 | 173 | Written | None | Online | Open access |  
                      | RCN1 - Rostock Newspaper Corpus | 1700 | 2000 | LModE PDE | 600,000 |  | Written |  | On-site |  |
					   | SC - Salamanca Corpus. Digital Archive of English Dialect Texts | 1500 | 1950 | EModE LModE | 6,115,267 |  | Written | Tagging | Online |  |  
					   | SCEPA - Small Corpus of English Political Apologies | 1950 | 2017 | PDE | 22,538 | 232 | Written & Spoken | Other | Download | Open access |  
					   | SCoCESLE - Small Corpus of Colombian English as a Second Language Essays | 2022 | 2023 | PDE | 81,994 | 272 | Written | None | Download | Open access |  
                      | SCONE - Seville Corpus of Northern English | 600 | 1590 | OE ME |  |  | Written | Other | Download | Open access |
                      | SCOTS - Scottish Corpus of Texts & Speech | 1945 | 2007 | PDE | 4,000,000 | 1177 | Written & Spoken | Other None | Online | Open access |
					   | SCPS - Small Corpus of Political Speeches | 1789 | 2010 | PDE | 655,479 | 239 | Written & Spoken | Tagging | On-site | License required |  
                      | TaCoCASE - Transatlantic Component of the Corpus of Academic Spoken English | 2016 | 2023 | PDE | 140,003 | 15 | Spoken | Other | Online Download | Free subscription |
					   | TIME - TIME corpus | 1923 | 2009 | PDE | 100,000,000 | 275,000 | Written | Tagging | Online | Free subscription |  
                      | ViMELF - Corpus of Video-Mediated English as a Lingua Franca Conversations | 2012 | 2015 | PDE | 152,472 | 20 | Spoken | Tagging Other None | Download | Free subscription |
                      | VOICE - Vienna-Oxford International Corpus of English | 2000 | 2007 | PDE | 1,023,043 | 151 | Spoken | Other | Online | Free subscription |  
					   | WestLabUSENET - Reduced redundancy USENET corpus | 2005 | 2011 | PDE | 6,089,697,986 | 22,799,995 | Written | None | Download | Open access |  
                      | YCCQA - Yahoo-based Contrastive Corpus of Questions and Answers | 2006 | 2009 | PDE | 29,400,000 | 665,000 | Written | Other | Download | Free subscription |  
					   | YCOE - The York-Toronto-Helsinki Parsed Corpus of Old English Prose | - | - | OE | 1,500,000 | 100 | Written | Tagging | Download | Free subscription |  
                      | ZEN - Zurich English Newspaper corpus | 1661 | 1791 | EModE LModE | 1,600,000 | 349 | Written | Other | Online CD | Free subscription |

                    Corpus Finder categories
                    
                      |  Corpus  |  Some corpora consist of subcorpora (CEEC, CEEM). In these cases both the entire corpus and the subcorpora have been listed; the subcorpora are indented.  |  
                      |  Start, End, Periods  |  The period labelling follows roughly the categorisation below unless a particular period is specified in the name of the corpus.  |  
                      |  OE  |  Old English c. -1300  |  
                      |  ME  |  Middle English c. 1300-1500  |  
                      |  EModE  |  Early Modern English c. 1500-1700  |  
                      |  LModE  |  Late Modern English c. 1700-1900  |  
                      |  PDE  |  Present Day English 1900-  |  
                      |  Word count, Text samples  |  Left empty when the word count or number of text samples is unknown.  |  
                      |  Spoken/Written  |  Shows whether the corpus material is from written sources, recorded speech or both.  |  
                      |  Annotation  |  Tagging  | Part-of-speech annotation   |  
                      |  Parsing  | Syntactic annotation   |  
                      |  Other  | Annotation of, e.g., discursive features, text structure, phonetic features, orthography, etc.  |
                      |  None  |    |  
                      |  Format  |  CD/DVD  |  The corpus is distributed on a disc.  |  
                      |  Download  |  The corpus can be downloaded from the internet. |  
                      |  Online  |  The corpus is accessible online without downloading.  |  
                      |  On-site  |  The corpus can only be accessed locally.  |  
                      |  Availability  |  Open access  |  The corpus can be freely used by anyone.  |  
                      |  Free subscription  |  The corpus is free to use but requires a subscription.  |  
                      |  License required  |  A paid subscription is required.  |
                      |  Commercial  |    |  
                      |  In preparation  |    |  
                      |  Not available  |  The corpus is not available to external users for copyright reasons.  |

                    JavaScript for the Corpus Finder table by Max Guglielmi (http://tablefilter.free.fr/).
                
                
               |
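                    The period labelling described under "Start, End, Periods" can be sketched as a small lookup. The boundaries below are the approximate ones given in the category table; treating each period as a half-open interval and the function name `period_labels` are assumptions of this sketch. Because the table's labelling is only rough, some entries (for example a corpus whose end year falls just past a boundary) may carry fewer labels than this function returns.

```python
from typing import List, Optional, Tuple

# Approximate period boundaries as given in the category table.
# Treating each period as the half-open interval [lower, upper) is an
# assumption of this sketch; None marks an open-ended side.
PERIODS: List[Tuple[str, Optional[int], Optional[int]]] = [
    ("OE", None, 1300),
    ("ME", 1300, 1500),
    ("EModE", 1500, 1700),
    ("LModE", 1700, 1900),
    ("PDE", 1900, None),
]


def period_labels(start: int, end: int) -> List[str]:
    """Return every period label whose interval overlaps the years start..end."""
    if start > end:
        raise ValueError("start year must not be later than end year")
    labels = []
    for label, lower, upper in PERIODS:
        begins_before_period_ends = upper is None or start < upper
        ends_after_period_begins = lower is None or end >= lower
        if begins_before_period_ends and ends_after_period_begins:
            labels.append(label)
    return labels


# Start and end years taken from the ARCHER and BROWN rows of the table.
print(period_labels(1600, 1999))
print(period_labels(1961, 1961))
```

                    For the ARCHER and BROWN rows the result agrees with the Periods column of the table.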