Editorial Policy in MEMT
(Text extracted and adapted from "Introduction to MEMT", on Middle English Medical Texts (2005) CD-ROM.)
The editorial policy in MEMT is motivated by the research purposes for which the corpus was originally designed, i.e. the study of changing thought-styles in the history of medical science. Most texts in the corpus have been processed from modern editions of medieval manuscript texts. Two of Caxton's texts, Ars moriendi and Gouernayle of Helthe, are based on early printed books, and seven texts have been transcribed directly from manuscripts (see section 2.2). The texts have been rendered in computer-readable form by keying in or by optical character recognition (OCR) scanning. All texts have been proofread, corrected and checked three times, but some inconsistencies may still remain even after these measures.
The fact that MEMT, like all other historical corpora available at present, is based on editions means that it should not be used as the sole basis of research for tasks that require systematic reproduction of manuscript accidentals (see below). Another implication is that the texts contain two layers of editorial intervention. The first layer consists of the editorial practices adopted by the original editors in converting the handwritten manuscript texts into print. The second layer consists of the measures taken by the MEMT compilers in converting printed editions into electronic texts. The following sections discuss both editorial layers.
Editorial Principles
In processing editions into corpus texts, we have not aimed at levelling the variation in the original editorial practices reflected in our source texts. We have, however, aimed at reproducing the texts in the editions faithfully. At the same time we have formatted the texts systematically in two ways according to the generally accepted principles of corpus compilation, even if some of the actual practices we have adopted are different from other corpora.
Firstly, as in other historical corpora, some information traditionally contained in text editions has been eliminated in the conversion into corpus texts. For example, the corpus files only contain the body text presented in the editions, i.e. we have omitted the textual notes and the critical apparatus, including variant readings in other manuscripts; to obtain this information, MEMT users are referred to the original editions. Unlike in some historical corpora, changes of font in the source texts are not coded in the electronic texts, so that e.g. items italicized in the editions appear in basic Roman type in the electronic texts.
Secondly, we have encoded the texts using a corpus mark-up scheme to provide information about the text, to compensate for the loss of information in the transfer to electronic format, and to manage further processing. For example, some crucial information given by the editors in the notes, e.g. on changes of scribal hand or source manuscript, has been incorporated into the texts as comments that have been encoded accordingly. Because of the layered nature of the corpus data, the mark-up scheme includes two types of editorial comments: comments by the original editor concerning the processing of the text from a manuscript to a text edition, and our own comments that concern the processing of a text edition into a computer-readable corpus text.
Short title
Each text has been assigned a short title, which appears at the beginning of the electronic text file, indicated by the "our comment" code [^…^] (see section 4.1.14). The same short title is used to identify the text in the MEMT Presenter file tree, in the pop-up card and in the catalogue of texts containing background information about the text. Original titles, incipits or headings reproduced in the edition are also included (see section 4.1.2).
Headings and subheadings [}…}]
Headings and subheadings in the source text have been retained and coded using a special "heading" code [}…}].
Page and folio numbers
The page numbers of the source text are indicated in the electronic text. Each page number is preceded by the code |P_. Thus, for example, p. 54 in an edition is presented as |P_54 in the electronic text. Original manuscript folio/page numbers are reproduced from the editions; they are indicated by the "editor's comment" code [\...\]. For example, the verso of folio 5 indicated by the editor appears as [\f. 5v\] in the electronic text. Not all editors have indicated folio numbers, and electronic texts based on such source texts do not contain folio numbers.
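As a minimal sketch of how these markers might be used, the following Python snippet tags every line of a text with the most recent edition page and manuscript folio. The function name, the sample text, and the assumption that folio comments always begin with "f." (as in the example above) are illustrative only; editor's comments may also contain other kinds of information.

```python
import re

# |P_ followed by the page number of the edition (kept as a string).
PAGE_RE = re.compile(r"\|P_(\w+)")
# Editor's comment [\...\] whose content starts with "f." (assumed folio form).
FOLIO_RE = re.compile(r"\[\\(f\.[^\\]*)\\\]")

def with_locations(text):
    """Return (page, folio, line) for each line, carrying markers forward."""
    page = None
    folio = None
    located = []
    for line in text.splitlines():
        page_match = PAGE_RE.search(line)
        if page_match:
            page = page_match.group(1)
        folio_match = FOLIO_RE.search(line)
        if folio_match:
            folio = folio_match.group(1).strip()
        located.append((page, folio, line))
    return located

# Invented sample, not taken from the corpus.
sample = "|P_54\nfirst line\n[\\f. 5v\\] second line\n|P_55\nthird line"
for page, folio, line in with_locations(sample):
    print(page, folio, line)
```

Because not all editors indicate folio numbers, the folio value simply stays empty (None) for texts without them.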
Word division
Word division has been reproduced from the source texts. For subsequent indexing and concordancing, parts of words divided by a hyphen at the end of lines have been joined, and the end of the word has been moved to the same line as the beginning (e.g. phy-#sicke → physicke #, where # indicates line division). Compounds divided by a hyphen at the end of lines have been joined, and the second part of the compound has been moved to the same line as the beginning (e.g. ther-#of → ther-of #). Compounds written as two separate words in the source text have not been joined (e.g. to#guyder → to#guyder). Two words typed as one word in the source text have not been separated (e.g. thend → thend 'the end'). Capitalized words or parts of words, e.g. at the beginning of paragraphs, have been retained. For abbreviations and superscripts, see sections 4.1.8 and 4.1.9.
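The joining rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the compilers' actual procedure: the function name is hypothetical, and because telling a hyphenated compound from a broken simple word requires lexical knowledge, the decision is delegated to a caller-supplied predicate (is_compound), which is likewise an assumption.

```python
def join_line_end_hyphens(lines, is_compound=lambda left, right: False):
    """Join words split by a line-end hyphen, moving the tail up one line.

    is_compound(left, right) decides whether the hyphen is kept
    (compound, e.g. ther-of) or dropped (simple word, e.g. physicke).
    """
    lines = list(lines)
    for i in range(len(lines) - 1):
        cur = lines[i].rstrip()
        # A hyphen preceded by a space is a sentential dash, not a word break.
        if not cur.endswith("-") or cur.endswith(" -"):
            continue
        nxt = lines[i + 1].lstrip()
        if not nxt:
            continue
        parts = nxt.split(None, 1)
        tail = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        left = cur[cur.rfind(" ") + 1:-1]   # word fragment before the hyphen
        joiner = "-" if is_compound(left, tail) else ""
        lines[i] = cur[:-1] + joiner + tail
        lines[i + 1] = rest
    return lines

print(join_line_end_hyphens(["the phy-", "sicke of"]))
print(join_line_end_hyphens(
    ["ther-", "of the"],
    is_compound=lambda left, right: (left, right) == ("ther", "of"),
))
```

Line-end hyphens that are preceded by a space are left alone, since those represent sentential dashes rather than word breaks.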
Line division
The main line division of the source text has been preserved, i.e. a new line in the source text begins a new line in the electronic text. For word division at the end of lines, see section 4.1.4 above.
Paragraph division
A new paragraph is preceded by an empty line, which is also inserted when the paragraph change co-occurs with a page change. Paragraph-initial strings of spaces in the original have been ignored so that the text always begins from the left margin. Capitalized words or parts of words at the beginning of paragraphs have been retained.
Punctuation
The punctuation of the source text has been reproduced as far as possible. The main punctuation marks are as follows:
! = exclamation mark
, = comma
. = period
/ = slash or virgule
: = colon
; = semicolon
? = question mark
- = hyphen or dash
' = single quote or apostrophe
( ) = brackets (reproduced from source texts)
< > = diamond brackets (indicating words inserted between the lines or in the margin in source texts)
The slash and double slash (//) may occur as clause separators or in fractions (e.g. 1/2). Slashes indicating manuscript line division in the source text have been ignored. The period (.) may occur with numerals as well (e.g. .x. 'ten'). Hyphens and dashes used in sentential functions in the source text have been keyed in as hyphens, preceded and followed by a space (this type of hyphen may occur in a line-end position, see section 4.1.4).
In some source texts, e.g. in some of the Chauliac editions by Björn Wallner, punctuation marks added by the editor have been placed in brackets indicating editorial emendation, e.g. [,]. In the electronic version the brackets surrounding editorial punctuation marks have been omitted.
Abbreviations
Abbreviations expanded and italicized in the source texts appear in Roman type in the electronic texts. Abbreviations indicated by a tilde (~) above or through the letter(s) have been coded with the letter followed by a tilde (~).
Superscript (=)
Letters printed in superscript in the source text are preceded and followed by '=' (e.g. w with superscript t → w=t= 'with').
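For searching or normalisation it may be convenient to list or flatten the superscript coding. The sketch below assumes that '=' occurs in the texts only as the superscript delimiter; the function names are illustrative.

```python
import re

# One or more non-space characters enclosed between two '=' signs.
SUPERSCRIPT_RE = re.compile(r"=([^=\s]+)=")

def find_superscripts(text):
    """Return the letters that were printed in superscript."""
    return SUPERSCRIPT_RE.findall(text)

def flatten_superscripts(text):
    """Drop the '=' delimiters, keeping the superscript letters inline."""
    return SUPERSCRIPT_RE.sub(r"\1", text)

print(find_superscripts("w=t="))     # ['t']
print(flatten_superscripts("w=t="))  # wt
```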
Special characters 'thorn' (Þ, þ) and 'yogh' (3)
The letter 'thorn' is reproduced from the source texts; both upper- and lower-case forms are used (Þ, þ). The letter 'yogh' in the source text, both upper- and lower-case, is keyed in as the numeric character 3.
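Because yogh shares its code with the numeral 3, restoring the letter requires a heuristic. The following sketch replaces 3 only when it touches an alphabetic character. This rule is an assumption made for illustration and will not be right in every context; since upper- and lower-case yogh are keyed identically, only the lower-case form ȝ can be restored.

```python
import re

# A '3' immediately followed or preceded by a letter (including thorn).
YOGH_RE = re.compile(r"3(?=[^\W\d_])|(?<=[^\W\d_])3")

def restore_yogh(text):
    """Replace letter-adjacent '3' with lower-case yogh; leave numerals alone."""
    return YOGH_RE.sub("\u021d", text)

print(restore_yogh("ni3t"))     # niȝt
print(restore_yogh("3e"))       # ȝe
print(restore_yogh("3 or 30"))  # unchanged
```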
Symbols
Special symbols in the source texts have been coded using the following representations:
+n q = 'quarter'
+o = 'ounce'
+Q = 'drachm'
+q = 'scruple'
+R = 'recipe'
+s = 'semis' (ss)
+ + = 'cross'
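A lookup table is one simple way to gloss these codes when reading or searching the texts. The sketch below covers only the five single-letter codes as they are written in the list above (+o, +Q, +q, +R, +s); the 'quarter' and 'cross' codes are left out, and the function name is illustrative.

```python
import re

# Glosses for the single-letter symbol codes as listed above.
SYMBOL_GLOSSES = {
    "o": "ounce",
    "Q": "drachm",
    "q": "scruple",
    "R": "recipe",
    "s": "semis",
}

# '+' followed by one of the code letters, ending at a word boundary.
SYMBOL_RE = re.compile(r"\+([oQqRs])\b")

def expand_symbols(text):
    """Replace each symbol code with its English gloss."""
    return SYMBOL_RE.sub(lambda m: SYMBOL_GLOSSES[m.group(1)], text)

print(expand_symbols("+R +o +Q +q +s"))
```

The codes are case-sensitive: +Q and +q stand for different units, so the text should not be case-folded before expansion.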
Emendation [{…{]
Editorial emendations in the source text have been retained and coded in the electronic text using an "emendation" code [{…{].
Editor's comment [\…\]
The code for "editor's comment" [\…\] is used for indicating information added by the editor, such as folio or page numbers in the manuscript, headings and sub-headings that are not in the original, item numbering, comments on the hand, and changes between source manuscripts. Note that changes in the hands or source manuscripts are not indicated by all editors.
Our comment [^…^]
"Our comment" code [^…^] is used for indicating the short title assigned to each text (see section 4.1.1). It is also used for indicating omissions of extra-textual material in the text (see section 4.1.15).
Material omitted
(a) Extra-textual material in the text (titles, tables, diagrams, pictures, signs of the zodiac etc.) has been excluded; explanatory comments on omissions have been added when deemed necessary.
(b) All typographical shifts (italics, bold face, etc.) in the source texts have been ignored, including italicized expansions of abbreviations, which occur repeatedly in the texts.
(c) Changes of language in the source text have not been coded. All foreign language segments have been included, except for passages in non-Latin alphabet, which have been omitted and coded accordingly, using the "our comment" code (see section 4.1.14).
(d) Paragraph signs (¶) in the source text have been omitted.
(e) Accents have been ignored.
(f) Marginal comments reproduced in the editions have been omitted.
Mark-up scheme
The following codes are used in the mark-up scheme in MEMT; for further explication of their use, see the relevant sub-sections in Editorial practices, section 4.1 above.
[}...}] = 'heading'
[{...{] = 'emendation'
[\...\] = 'editor's comment' (including manuscript folio numbers)
[^...^] = 'our comment' (including the short title assigned to each text)
|P_### = 'page number (in the edition)'
+ = 'symbol' (in combination with alphabetic characters n, O, Q, q, R, s)
= = 'superscript'
~ = 'abbreviation'
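The bracketed codes in this scheme can be processed with a few regular expressions. The following is a minimal sketch, assuming that codes are not nested and that opening and closing delimiters are always paired as listed; the names CODES, extract and plain_text are illustrative, and the sample string is invented.

```python
import re

# One pattern per bracketed code; the captured group is the coded content.
CODES = {
    "heading": re.compile(r"\[\}(.*?)\}\]", re.S),
    "emendation": re.compile(r"\[\{(.*?)\{\]", re.S),
    "editor_comment": re.compile(r"\[\\(.*?)\\\]", re.S),
    "our_comment": re.compile(r"\[\^(.*?)\^\]", re.S),
}
PAGE_RE = re.compile(r"\|P_\S+")

def extract(text, kind):
    """Return the contents of every code of the given kind."""
    return [content.strip() for content in CODES[kind].findall(text)]

def plain_text(text):
    """Drop comments and page codes; keep heading and emendation wording."""
    text = CODES["our_comment"].sub("", text)
    text = CODES["editor_comment"].sub("", text)
    text = PAGE_RE.sub("", text)
    text = CODES["heading"].sub(r"\1", text)
    text = CODES["emendation"].sub(r"\1", text)
    return text

sample = ("[^SHORT TITLE^]\n[}A heading}]\n|P_1\n"
          "[\\f. 1r\\] some [{emended{] words")
print(extract(sample, "our_comment"))
print(extract(sample, "editor_comment"))
print(plain_text(sample).split())
```

Keeping the two comment types separate makes it possible to treat the original editors' interventions and the compilers' own notes differently, in line with the two editorial layers described above.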