Basic structure

The files in the ZEN Corpus correspond to individual newspaper issues.

Parameter and other coding

The CD-ROM contains two TEI-conformant XML versions, encoded in UTF-8 and ASCII respectively. It also contains a text-only version intended for use with software that is not XML-aware.

The texts are coded for year, individual newspaper issue, and domain or text class (foreign news, home news, advertisements, etc.).
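As an illustration of how text-class coding in an XML version might be used, the following minimal sketch groups passages by class. The element and attribute names (`div`, `type`), the class labels, and the sample text are assumptions made for this example only; they are not taken from the ZEN encoding and should be checked against the corpus files.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample, NOT taken from the corpus: the markup scheme and the
# class labels are hypothetical placeholders.
SAMPLE = """<TEI>
  <text><body>
    <div type="foreign_news"><p>Paris, May 3.</p></div>
    <div type="advertisement"><p>Lost, a silver watch.</p></div>
    <div type="foreign_news"><p>Vienna, April 28.</p></div>
  </body></text>
</TEI>"""


def texts_by_class(xml_string):
    """Group the plain text of each <div> by its (assumed) type attribute."""
    root = ET.fromstring(xml_string)
    grouped = defaultdict(list)
    for div in root.iter("div"):
        text_class = div.get("type", "unclassified")
        # Join all text nodes inside the div and normalise whitespace.
        text = " ".join("".join(div.itertext()).split())
        grouped[text_class].append(text)
    return dict(grouped)


if __name__ == "__main__":
    for text_class, passages in texts_by_class(SAMPLE).items():
        print(text_class, len(passages))
```

The sketch assumes flat, non-nested divisions; with nested divisions the inner text would be counted more than once.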

The corpus is divided into the following time periods:

  • up to 1665
  • 1671–1691
  • 1701–1721
  • 1731–1751
  • 1761–1781
  • after 1785
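The period division above can be expressed as a simple lookup, sketched here under stated assumptions: "up to 1665" is read as including 1665, "after 1785" as starting in 1786, and years that fall between the listed ranges are treated as belonging to no period. The function name `period_for_year` is hypothetical.

```python
from typing import Optional

# (start, end, label); None marks an open-ended bound.
# Assumption: "after 1785" is read strictly, i.e. 1786 onwards.
PERIODS = [
    (None, 1665, "up to 1665"),
    (1671, 1691, "1671–1691"),
    (1701, 1721, "1701–1721"),
    (1731, 1751, "1731–1751"),
    (1761, 1781, "1761–1781"),
    (1786, None, "after 1785"),
]


def period_for_year(year: int) -> Optional[str]:
    """Return the label of the period containing `year`, or None for a gap."""
    for start, end, label in PERIODS:
        if (start is None or year >= start) and (end is None or year <= end):
            return label
    return None


if __name__ == "__main__":
    for y in (1665, 1668, 1701, 1790):
        print(y, period_for_year(y))
```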

Genre covered by corpus

The corpus covers all major English newspapers of the late 17th and the 18th century published in London, including:

  • The London Gazette, which constitutes about half of the corpus

Thrice-weekly papers

  • The London Post
  • The English Post
  • The Post Man
  • The Post Boy
  • The Flying Post

Daily newspapers

  • The Daily Courant
  • The Daily Post

Evening papers

  • The Evening Post
  • The St. James's Evening Post

"Tabloids"

  • The Morning Post

Advertising papers

  • The Daily Advertiser
  • The Champion or Evening Advertiser
  • The General Advertiser
  • The Gazetteer and New Daily Advertiser
  • The London (Daily) Advertiser (and Literary Gazette)
  • The London Daily Post and General Advertiser
  • The London Morning Advertiser
  • The Public Advertiser