Basic structure

The JSCC contains 23 transcribed files. Each file contains one lecture, averaging about 20 minutes, followed by its question-and-answer session of about 10 minutes.

The corpus contains just over 100,000 words, 77% coming from the presentations and 23% from the Q&A sessions.

Transcription style

The JSCC was transcribed using the standard MICASE transcription conventions, with one exception: we have double-coded the speaker information. In addition to the standard speaker (S) number, which corresponds to the order in which speakers appear in each transcript, we have added stable participant (PID) initials; each identifiable speaker is marked in the header with a unique set of initials. This allows investigators, if they wish, to track the rhetorical preferences and “speaking styles” of individuals throughout the corpus.

N.B. Not all speakers could be identified. When the speaker is unknown, you will see only an S-number, with no PID.
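
To illustrate why the double coding is useful, here is a minimal Python sketch of following one participant across transcripts. Everything in it is assumed for illustration: the transcript names, the initials, and the dictionary layout are hypothetical and do not reflect the actual JSCC file format or header syntax. The point is only that an S-number changes from file to file, while a PID stays the same.

```python
from collections import defaultdict

# Hypothetical data: for each transcript, a map from S-number to PID initials.
# None marks a speaker who could not be identified (S-number only).
# Names, initials, and layout are invented; this is not the JSCC file format.
transcripts = {
    "lecture_01": {"S1": "AB", "S2": "CD", "S3": None},
    "lecture_02": {"S1": "CD", "S2": None, "S3": "AB"},
}


def appearances_by_pid(transcripts):
    """Group (transcript, S-number) pairs under each stable PID."""
    by_pid = defaultdict(list)
    for name, speakers in sorted(transcripts.items()):
        for s_number, pid in sorted(speakers.items()):
            if pid is not None:  # skip speakers who have no PID
                by_pid[pid].append((name, s_number))
    return dict(by_pid)


index = appearances_by_pid(transcripts)
print(index["CD"])  # [('lecture_01', 'S2'), ('lecture_02', 'S1')]
```

In this toy data the participant "CD" is S2 in one transcript and S1 in the other, so only the PID links the two appearances. Unidentified speakers are left out of the index because they carry no PID.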

The John Swales Conference Corpus
