Basic structure of EMMA
Early-Modern Multiloquent Authors (EMMA) consists of 50 carefully selected authors across 5 generations. The following graph gives an overview of the corpus distribution.
Selection criteria included:
- prolific writers: min. 500,000 words/author
- career & distribution: long career with sufficient material across career stages
- London-based elite
- social network information: connections within and across generations
- religion: Church of England vs Non-Conformists; Quakers
- politics: Royalists vs Parliamentarians
- professions: clergy, politicians, dramatists/authors, philosophers, scientists
Genre balance was not a primary criterion. However, the corpus contains considerable amounts of text from the predominant written genres of the 17th century. The following table gives word counts for the genres that are represented by at least 50,000 words in each of the first four generations in EMMA phase I.
| Genre | Generation 1 | Generation 2 | Generation 3 | Generation 4 |
|---|---|---|---|---|
| biography | 261633 | 181488 | 555232 | 142332 |
| dialogue | 445454 | 71760 | 350166 | 513010 |
| drama | 502288 | 719778 | 865389 | 700749 |
| fiction | 125605 | 484891 | 308529 | 400477 |
| letters | 1106451 | 475776 | 1297755 | 1034385 |
| poetry | 585063 | 286585 | 184920 | 341774 |
| prose | 23425330 | 4545774 | 13417412 | 2588806 |
| science | 1359196 | 1518364 | 2645128 | 212332 |
| sermons | 1596864 | 1579549 | 3023848 | 1348144 |
| other genres | 4811100 | 1217321 | 1152164 | 2844147 |
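The 50,000-word threshold can be checked directly against the figures in the table. The following is a minimal sketch, not part of the EMMA tooling: the dictionary `counts`, the constant `THRESHOLD` and the helper `smallest_cell` are illustrative names, and the numbers are copied by hand from the table above (the aggregate "other genres" row is left out).

```python
# Word counts per named genre for generations 1-4, copied from the table.
counts = {
    "biography": [261633, 181488, 555232, 142332],
    "dialogue": [445454, 71760, 350166, 513010],
    "drama": [502288, 719778, 865389, 700749],
    "fiction": [125605, 484891, 308529, 400477],
    "letters": [1106451, 475776, 1297755, 1034385],
    "poetry": [585063, 286585, 184920, 341774],
    "prose": [23425330, 4545774, 13417412, 2588806],
    "science": [1359196, 1518364, 2645128, 212332],
    "sermons": [1596864, 1579549, 3023848, 1348144],
}

THRESHOLD = 50_000  # minimum words per genre per generation


def smallest_cell(table):
    """Return (genre, generation, words) for the smallest count in the table."""
    return min(
        (
            (genre, generation, words)
            for genre, row in table.items()
            for generation, words in enumerate(row, start=1)
        ),
        key=lambda cell: cell[2],
    )


# True only if every genre reaches the threshold in every generation.
meets_threshold = all(
    words >= THRESHOLD for row in counts.values() for words in row
)

genre, generation, words = smallest_cell(counts)
print(f"All cells >= {THRESHOLD}: {meets_threshold}")
print(f"Smallest cell: {genre}, generation {generation}, {words} words")
```

The smallest cell shows how close the thinnest genre comes to the cut-off, which is useful when deciding whether a genre is large enough for per-generation comparison.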
A more detailed description of the corpus is being prepared for submission to the ICAME journal.