Michigan Corpus of Academic Spoken English (MICASE)

The Michigan Corpus of Academic Spoken English (MICASE) is a collection of nearly 1.8 million words of transcribed speech (almost 200 hours of recordings) from the University of Michigan (U-M) in Ann Arbor, created by researchers and students at the U-M English Language Institute (ELI). MICASE contains data from a wide range of speech events (including lectures, classroom discussions, lab sections, seminars, and advising sessions) and locations across the university.

Project leader: Dr. Ute Römer, University of Michigan
Time of compilation: 1997–2002
Size: 1.8 million words
Language: English
Number of texts/samples: 152 (over 190 hours)
Period: 1997–2001
Released: 2002; new interface 2007
Funding: English Language Institute, University of Michigan
Project home page: http://quod.lib.umich.edu/m/micase/

Reference lines and copyright

Simpson, R. C., S. L. Briggs, J. Ovens, and J. M. Swales. (2002) The Michigan Corpus of Academic Spoken English. Ann Arbor, MI: The Regents of the University of Michigan.

MICASE is owned by the Regents of the University of Michigan, who hold the copyright. The database has been developed by the English Language Institute, and the web interface by Digital Library Production Services. The original DAT audiotapes are held in the English Language Institute and may be consulted by bona fide researchers under special arrangements. The database is freely available at the MICASE website for study, teaching and research purposes, and copies of the transcripts may be distributed, as long as either this statement of availability or the citation given above appears in the text. However, if any portion of this material is to be used for commercial purposes, such as for textbooks or tests, permission must be obtained in advance and a license fee may be required. Furthermore, some restrictions apply on the citation of specific portions of some of the transcripts in educational presentations and publications; all such restrictions are noted in the headers of individual files of the corpus. For further information about copyright permissions, please contact Dr. Ute Römer at elicorpora@umich.edu.

Manual

Available as PDF; the MICASE website also contains online tutorials.

Compilers

Rita Simpson-Vlach (project manager 1997 to 2006); John Swales (faculty advisor); Sarah Briggs (testing advisor)

For current and former MICASE team members, see the Michigan Corpus Linguistics website and the MICASE history page at http://micase.elicorpora.info/about-micase.

Availability

The entire corpus is accessible on the web, on a site with a searchable interface much like a concordance program: http://quod.lib.umich.edu/m/micase/

A version of MICASE became available on CD-ROM or as a downloadable zip file in July 2003 for a nominal fee (the order form can be downloaded from http://micase.elicorpora.info/purchase-micase-materials/purchase-micase-transcripts). This distributed version differs from the online version in the following ways:

  • Only the distributed version of MICASE comes with the DTD (Document Type Definition), which specifies the elements, attributes, entities, and notations used in the XML transcripts.
  • The web version contains extraneous mark-up that speeds up the online search engine.
  • In the web version, the speech of some speakers has been hidden or deleted in accordance with consent restrictions.

Technical information

All files are in XML format.
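
Because the transcripts are XML, they can be inspected with any standard XML parser. The following is a minimal sketch in Python, not part of the MICASE distribution. It tallies element names and counts words of text content without assuming anything about the MICASE mark-up. The sample string and its element and attribute names (transcript, utterance, pause, speaker) are hypothetical placeholders, invented only so that the example runs on its own; the actual elements and attributes are those defined in the DTD that ships with the distributed version.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-in for a transcript file; the element and attribute
# names here are invented for illustration and are NOT the MICASE schema.
SAMPLE_XML = """<transcript id="demo">
  <utterance speaker="S1">okay let's get started</utterance>
  <utterance speaker="S2">can you repeat that</utterance>
  <pause/>
</transcript>"""


def count_elements(xml_text: str) -> Counter:
    """Return how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


def count_words(xml_text: str) -> int:
    """Count whitespace-separated tokens in all text content, ignoring mark-up."""
    root = ET.fromstring(xml_text)
    return sum(len(chunk.split()) for chunk in root.itertext())


if __name__ == "__main__":
    print(count_elements(SAMPLE_XML))
    print(count_words(SAMPLE_XML))
```

For a file on disk, ET.parse(path).getroot() would take the place of ET.fromstring. Note that ElementTree does not validate a document against a DTD, so this sketch only reads the mark-up and does not check it.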

Associated projects

MICUSP (Michigan Corpus of Upper-level Student Papers)

JSCC (John Swales Conference Corpus)