Basic structure of the Buckeye Corpus
- 2–5 audio files per speaker
- 40 speakers
- 19 hours of phonetically tagged speech
- hand-corrected phonetic tags, using a superset of ARPABET
Genre
Conversational speech, collected with deception: talkers were told that the recording session was part of a focus group on local issues.
Sociolinguistic coverage
Sample stratified by age and gender. Many other sociolinguistic variables were not controlled, since respondents were recruited through ads in local papers.