Basic structure of the corpus
The four corpora at the core of the ‘Brown family’, i.e. the 1960s Brown and LOB corpora and their 1990s updates Frown and F-LOB, all follow the same structure of textual genres. Each corpus consists of 500 texts of about 2,000 words each, giving a total of around one million words per corpus. The table below lists the genres included in the corpora.
Table 1. Text categories in the Brown family of matching 1-million-word corpora of written StE (Source: Manual of the POS-tagged ‘Brown’ corpora)
| Genre group | Category | Content of category | No. of texts |
|---|---|---|---|
| Press (88) | A | Reportage | 44 |
| | B | Editorial | 27 |
| | C | Review | 17 |
| General Prose (206) | D | Religion | 17 |
| | E | Skills, trades and hobbies | 36 |
| | F | Popular lore | 48 |
| | G | Belles lettres, biographies, essays | 75 |
| | H | Miscellaneous | 30 |
| Learned (80) | J | Science | 80 |
| Fiction (126) | K | General fiction | 29 |
| | L | Mystery and detective fiction | 24 |
| | M | Science fiction | 6 |
| | N | Adventure and Western | 29 |
| | P | Romance and love story | 29 |
| | R | Humor | 9 |
| Total | | | 500 |
Figure 1. Text categories in the Brown family of corpora.
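For readers who want to work with the category scheme programmatically, the layout in Table 1 can be written down as a small lookup structure. The following is only a minimal sketch: the names `BROWN_FAMILY_LAYOUT`, `WORDS_PER_TEXT`, `group_totals` and `genre_group_of` are hypothetical and not part of any corpus distribution or tool, and the figure of 2,000 words per text is the approximate sample size given above, not an exact count.

```python
# Category layout copied from Table 1:
# genre group -> {category letter: number of texts}
BROWN_FAMILY_LAYOUT = {
    "Press": {"A": 44, "B": 27, "C": 17},
    "General Prose": {"D": 17, "E": 36, "F": 48, "G": 75, "H": 30},
    "Learned": {"J": 80},
    "Fiction": {"K": 29, "L": 24, "M": 6, "N": 29, "P": 29, "R": 9},
}

# Approximate sample length stated in the text (an assumption, not an exact count).
WORDS_PER_TEXT = 2000


def group_totals(layout=BROWN_FAMILY_LAYOUT):
    """Return the number of texts in each genre group."""
    return {group: sum(cats.values()) for group, cats in layout.items()}


def genre_group_of(category, layout=BROWN_FAMILY_LAYOUT):
    """Return the genre group that a category letter (e.g. 'J') belongs to."""
    letter = category.upper()
    for group, cats in layout.items():
        if letter in cats:
            return group
    raise KeyError(f"Unknown category letter: {category!r}")


if __name__ == "__main__":
    totals = group_totals()
    n_texts = sum(totals.values())
    print(totals)
    print("Texts per corpus:", n_texts)
    print("Approximate words per corpus:", n_texts * WORDS_PER_TEXT)
```

The group totals computed this way should match the figures in parentheses in the first column of the table, which makes the sketch a quick consistency check on the category counts.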