Transcriptional Protocols
General Introduction

Note:

The Transcriptional Protocols are under revision. If you are currently preparing an edition that you wish to conform to PPEA/SEENET standards, please contact the Archive to receive a draft copy of the current protocols.

What To Do First

The transcriptional protocols are written more as a reference work than as a continuous narrative, and were not intended to introduce new archivists to the arts of electronic editing. As a result, a great deal of redundancy has been built into many entries, which in some cases will duplicate material elsewhere. The narrative introduction to the protocols linked below is not yet complete, but will soon serve as a means of familiarizing archivists with the general features of the Transcriptional Protocols, as well as with common pitfalls and important caveats culled from long experience.

Conventions Used in This Documentation

Directly under the header of each section containing a direct citation of a TEI-conformant element, or any other type of code or routine with a standard, documented specification, you will also find a callout box with links to the specification itself, in this form:

External Specifications:

Unicode

Generally, these links are intended for advanced users. They will take you not to the basic PPEA/SEENET documentation (which in any case will immediately follow the callout box in question), but rather to the international standard lying behind PPEA/SEENET practice. Following these links will not be necessary for a basic understanding of the Protocols, but rather will serve to inform more advanced discussion of proper usage and questions of the need for extension as these may arise. Even if there is no external link associated with a specification, hovering over its name will invoke a tooltip that will explain briefly what the standard is.

General examples illustrating some aspect of markup application, whether they are images or blocks of code, are contained in boxes with a darker background:

<sic>Dig Before You Call</sic>

<corr>Call Before You Dig</corr>

Specifications of markup such as the standard attributes of a given tag, and the standard values for those attributes, are contained within a similar box, shifted further to the right, with text in list form:

Standard values for the "agent" attribute of the `<dig>` tag:

bulldozer
backHoe
spade
trowel
teaspoon
sableBrush

The values listed in such boxes must be used rigorously in your markup, in exactly the form in which they appear in the bulleted list. Otherwise, later processing and display will be adversely affected. Should you need new attributes or values (or even whole new elements), as you would in this case if you had dug with a toothpick or lancet, you will need to review the TEI discussion on Conformance.

Finally, issues of special importance are highlighted in boxed paragraphs of their own, in this form:

Note:

Paragraphs such as these should always be read carefully, as they will contain tips on avoiding pitfalls that can cost you a great deal of time.

Beginning the Transcription

Version Control and Copy of Record

Maintaining Copy of Record, commonly known as "COR," is a matter of paramount importance, because failing to do so can lose you weeks, months or even years of work.

The problem arises from the possibility of having two or more people working on different but initially identical copies of the same file. In that situation, the work of only one of the people editing the file can be saved, because each person's copy will overwrite the original file when it comes time to return the file to the drive on which the edition is stored.

Transcribing each passus as an individual file has the advantage of allowing for greater ease in giving a team parts of an edition to work on individually and simultaneously, but even this method has proven subject to corruption of COR when more than one person was issued a copy of the same passus, but for a different work process.

As a result, if you allow anyone else to work on your edition, you will need to keep careful records of who has what files, when they were "checked out," when they were returned, how you vetted them before you allowed them to overwrite the old copies, and so forth.

Before any COR transfer, it is wise to back up the current state of your work in more than one copy and in more than one location, and it is absolutely essential to record COR transfers and work done in comments at the head of each file.

Sometimes, but not always, corrupted COR can be repaired using an application such as Beyond Compare, which has a most generous trial use policy, and which is in any case quite inexpensive and very powerful.

Note:

Develop and adhere to policies for maintaining copy of record. Never vary from your policies once you have found them to work. Such policies should include documentation within the altered files and an exchange of emails detailing the transfer.

It also does not hurt to accompany the transfer with some clear-cut physical symbol or gesture, or both, such as a handshake and the transfer of some small token object (a poker chip with the manuscript sigil written on it, for example), since this will fix the event in memory better than the mere receipt of an email will. Such a token should be used in addition to email documentation, not in place of it. This approach is easily mocked but anthropologically sound.

Always storing the COR of your edition on the same machine in the same directory will also increase the chances of clean COR transfers, as will always transferring COR over the same medium or media - a CD with a red label inside the jewelcase, for example.

Finally, if you do not wish to take maintenance of COR into account, do all work on paper or in separate .txt files and have one and only one person key or cut-and-paste that work into COR.

A New File for Each Passus

External Specifications:

Transcribe each passus into its own file. Because of the way the Archive's document type definition and its associated entity files have been written, you must name these files on the following pattern, where X is a hypothetical manuscript's sigil, and "passxx" is an abbreviation of "passus" and the passus number:

Xpass00.sgm, Xpass01.sgm, Xpass18.sgm, Xpass20.sgm, etc. (SGML, in old transcriptions)

Xpass00.xml, Xpass01.xml, Xpass18.xml, Xpass20.xml, etc. (XML, for new transcriptions)

Designating the prologue as passus 00 (two zeros), and putting a zero before single-digit passus numbers, will result in the files being sorted in order in any directory in the DOS, Windows and Unix environments, which sort in ASCII order. Otherwise, Xprol.sgm/Xprol.xml will always appear at the end of the list, Xpass2.sgm/Xpass2.xml will be sorted with Xpass20.sgm/Xpass20.xml instead of following Xpass1.sgm/Xpass1.xml immediately (which will instead be followed by Xpass10.sgm/Xpass10.xml), and so forth.
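The effect of zero-padding on sort order can be checked with a short script. The following is a minimal sketch in Python, not part of the Archive's tooling; the function name passus_filename is hypothetical, and the sigil X is the same hypothetical sigil used above.

```python
def passus_filename(sigil: str, passus: int, ext: str = "xml") -> str:
    """Build a name such as 'Xpass01.xml'; passus 0 stands for the prologue."""
    if not 0 <= passus <= 99:
        raise ValueError("passus number must fit in two digits")
    return f"{sigil}pass{passus:02d}.{ext}"


# Zero-padded names sort in reading order under plain ASCII sorting.
padded = sorted(passus_filename("X", n) for n in (20, 2, 11, 0, 1, 10))

# Unpadded names do not: the prologue lands last and Xpass2 follows Xpass11.
unpadded = sorted(
    ["Xprol.xml", "Xpass1.xml", "Xpass2.xml", "Xpass10.xml", "Xpass11.xml", "Xpass20.xml"]
)

print(padded)
print(unpadded)
```

Python's default string sort compares code points, which for these names matches the ASCII ordering described above.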

Setting up your edition by transcribing each passus into a separate file may seem a needless complication, but it has significant advantages over a single-file method, including management of Copy of Record when you want to have more than one person working on your edition at one time, and economies in use of memory during later machine processing.

Headers and Closing Tags for Each Passus

External Specifications:

Each passus, and each line group within any given passus, must be opened and closed with a tagset on the following model, which represents a one-line passus, the first passus in a hypothetical manuscript with sigil X:

<div1 type="passus" n="X1">

<head><hi rend="BinR"><hi rend="rb"><foreign lang="lat">Passus primus de visione</foreign></hi></hi></head>

<lg type="strophe">

<l id="X1.1" n="KD1.1"><hi rend="o5"><hi rend="bl">W</hi></hi>hat þis Mountaigne bymeneþ &punctus; and þe m<expan>er</expan>ke dale</l>

</lg>

<trailer><foreign><hi rend="rb">Explicit hic passus paruissimvs Petri Ploghman</hi></foreign></trailer>

</div1>

Note:

The extra spaces between the lines in this example are for ease in reading the code only. They should not be added as hard returns in your transcription, or in any part of your markup. The Elwood browser and several scripts used for preparing editions rely on there being no extra hard returns.

If any line-wrap appears in the example above, it is invoked by your browser, and likewise does not represent part of the intended transcription or markup.

Using NoteTab Pro from the Start

It is crucial to use a raw text editor for your transcription instead of a standard word processing program, because programs such as Microsoft Word and Word Perfect can introduce underlying code into your files that is difficult to remove later, even if you save as plain text.

NoteTab Pro is not only a clean raw text editor on the model of Wordpad or Notepad, but also has numerous features that can make your transcription much easier to do. You can for example open all the passus files at once and search them all simultaneously for a given form - and replace that form with something else, either globally or one instance at a time.

NoteTab Pro also has a clip library function - similar to the macros in Microsoft Word - that can be docked to the right or the left side of the workspace. The PPEA and SEENET have developed a custom clip library for their editors that allows automatic application of almost all tags to selected strings of text - a feature that prevents small errors, such as forgetting the forward slash in a closing tag, that can lead to large numbers of parsing errors in files.

NoteTab comes in both a paid Pro version for an astonishingly modest price, and in a somewhat reduced free version - NoteTab Light. Both are available at the NoteTab web site.

Text

Individual Characters and Graphs

Thorn, Yogh, and Other Non-ASCII Characters

External Specifications:

If you have access to an SGML or XML browser, use the appropriate entity references to represent these characters. The most common special characters in the Latin and Greek alphabets appear in both SGML and XML (Unicode) format in the Entity References section of the Technical Introduction.

It is imperative to use NoteTab Clips for the SGML or XML entity references of frequently used non-ASCII letters, both upper and lower case, and of some marks of punctuation, so as to keep them perfectly regular. If you do not have access to an SGML or XML browser, however, you may find proofreading easier if you make an initial transcription using the short and completely unambiguous alternate representations in the following list:

thorn = @
capital thorn = &T;
yogh = #
capital yogh = &#;
eth = %
capital eth = &%;

Each can later be converted to the appropriate Unicode entity reference with a global search/replace.
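The global search/replace can be sketched as a single-pass substitution. The Python example below is an illustration only: the function name convert_placeholders is hypothetical, and it assumes the targets are Unicode numeric character references (thorn U+00FE/U+00DE, yogh U+021D/U+021C, eth U+00F0/U+00D0) rather than the Archive's own named entities, which should be substituted if they differ. A single pass matters here because the replacement text itself contains "#", which a second, separate replace of "#" would otherwise corrupt.

```python
import re

# Placeholder -> numeric character reference (an assumption; adjust to the
# entity names your edition actually uses).
PLACEHOLDERS = {
    "&T;": "&#x00DE;",  # capital thorn
    "&#;": "&#x021C;",  # capital yogh
    "&%;": "&#x00D0;",  # capital eth
    "@": "&#x00FE;",    # thorn
    "#": "&#x021D;",    # yogh
    "%": "&#x00F0;",    # eth
}

# Capital forms are listed first so "&#;" is never read as "&" + "#" + ";".
# The lookbehind leaves the "#" of an existing reference such as "&#x00B7;" alone.
_PLACEHOLDER_RE = re.compile(r"&T;|&#;|&%;|@|(?<!&)#|%")


def convert_placeholders(text: str) -> str:
    """Replace every placeholder in one pass over the text."""
    return _PLACEHOLDER_RE.sub(lambda m: PLACEHOLDERS[m.group(0)], text)


print(convert_placeholders("@e #ere &T;at"))
```

The sketch assumes that "@", "#" and "%" occur in the file only as placeholders; any literal use of those characters in notes or headers would need to be protected first.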

Note:

Under no circumstances should you use the numeral three (3) as a substitute for the entity reference for a yogh, since when it comes time to search and replace the three (3) with the proper entity reference, the numeral threes that appear in your line numbers, headers and notes will all be changed to yogh.

A Special Case

The manuscript form <3> may represent either <z> or yogh. Transcribe it as <z> when it stands for /z/ or /s/ and yogh when it represents /j/ or the velar spirant.

Allographs

Editors must decide whether it is worthwhile to record allographic forms in any given edition. For instance, the F scribe uses a long <s> as well as a sigma <s> and one other form. He has three forms of <r> which are in free variation and thus carry no information. If you decide that variant letter forms have significance and want to tag them, entity references are the most efficient way to do that; e.g. &sigmas;, &longs;, etc.

Note:

Check the charts of available characters on the Unicode web site, examine the characters available in the Junicode font, and confer with the other editors before you decide that it is necessary to make up your own entity.

Concerning the allographs <i> and <j>: Transcribe <i> as <i> and long <I> as <I>. Use <j> only at the end of a series of minims, e.g. <iij> and <hij>, not <iiI> or <hiI>.

Note:

It is not necessary to use entity references for punctuation marks which are represented in the lower ASCII keyboard; all of the special punctuation entity references we presently use are listed in the Punctuation section.

Upper- and Lower-Case Letters

It is of great importance that you resolve how to handle transcription of ambiguous upper and lower case letter forms from the start.

Our policy is to follow the scribe's letterforms, but the issue is problematic because some characters have no distinctive upper and lower case forms. For instance, in many late ME hands there is no discernible upper and lower case distinction for <w>, <h>, or sigma <s>. Policy decisions with regard to capitalization can be made only after analysis of each individual manuscript.

For example, the F scribe clearly intended to emphasize the first letter in each line. After the second folio he marked each with a touch of red ink and in general, when distinctive case forms were available, he used upper case forms for the first character in each line. We have therefore used the modern typographic upper case character when, as in the case of <h>, the forms are indistinguishable. In the case of manuscript G, the scribe almost never began a line with a capital letter, so we normalize to the lower case when the letter forms are ambiguous.

Note:

Editors must decide upon a policy, document the policy explicitly, and then transcribe consistently.

Proper Nouns

The capitalization or non-capitalization of proper nouns will need to be decided on a case-by-case basis, judging by the scribal forms in each manuscript. In cases of non-dimorphic letters such as <w>, the editor should reason on the basis of the scribe's general usage. If a scribe appears to use capitalization for proper nouns in free variation with lower case forms, the editor should determine which the scribe chooses most often, and apply that form consistently to doubtful cases.

Punctuation

Use of Entity References

Use entity references for any scribal pointing not appearing in the lower ASCII keyboard, including the following:

&emdash; - — - em dash
&punctuselevatus; -  - punctus elevatus
&raisedpoint; - &#x00B7; - raised point
&tilde; - ~ - tilde
/ - / - solidus or virgule
¶ - ¶ - paraph marker
&tildeamp; - &̃ - tilded ampersand representing "and" against "et" in some manuscripts

Editorial Punctuation

Punctuation introduced by the editor, such as that in the notes, should use the lower ASCII keyboard. In addition, four characters that were formerly represented by entity references in SGML will be represented by their simple lower ASCII equivalents in XML. These are:

¶ - ¶ - paraph marker
þ - þ and &Thorn; - Þ - upper and lower case thorn
&tilde; - ~ - tilde
/ - / - solidus or virgule

Note:

Additionally, the entity references &lpar; and &rpar; (SGML) or the corresponding numeric character references &#x0028; and &#x0029; (XML/Unicode) should be used within notes to represent left and right parentheses, as doing so will distinguish them from the plain ASCII left and right parentheses used in the initial transcription to set off expanded suspensions. In this way, the ASCII parentheses can easily be replaced globally with opening and closing <expan> tags without also changing the parentheses in notes.

A full chart of the entity references used by the Archive is available in the Technical Matters page.

Shadow Hyphens and the TEI `<seg>` Element

Note:

One special case of editorial pointing in the text portion of the transcription (outside of notes) occurs in cases of ambiguous spacing of compounds and participles such as "a bout" or "y nempned." Here a hyphen is added, but it is tagged as a "shadow hyphen," thus:

a<seg type="shadowHyphen">-</seg>bout

For more on this sort of hyphenation, see Word Division.

The general rule for marking up punctuation is that any form that might need to be suppressed in a given stylesheet will need to be marked up with a <seg> tag. Other forms of punctuation will not be marked up.

Types of Punctuation that Are Marked Up

shadow hyphens (the hyphens between the parts of compounds such as a-bout)
swung dash and other line fillers (possibly)
any other form of punctuation that might need to be suppressed by one or more stylesheets (possibly)
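Because shadow hyphens are wrapped in a regular <seg> tag, a later process can suppress or display them mechanically. The following Python sketch shows the idea; the function names are hypothetical, and it assumes the tag is always written exactly as in the example above, with no variation in attribute order, quoting or spacing.

```python
import re

# Matches the shadow hyphen exactly as it is written in the protocols.
SHADOW_HYPHEN = re.compile(r'<seg type="shadowHyphen">-</seg>')


def suppress_shadow_hyphens(markup: str) -> str:
    """Drop editorial hyphens, e.g. to build a searchable word form."""
    return SHADOW_HYPHEN.sub("", markup)


def show_shadow_hyphens(markup: str) -> str:
    """Keep the hyphen as a plain character and drop only the tag."""
    return SHADOW_HYPHEN.sub("-", markup)


sample = 'a<seg type="shadowHyphen">-</seg>bout'
print(suppress_shadow_hyphens(sample))
print(show_shadow_hyphens(sample))
```

This is also why rigorous, regular tagging matters: a pattern this simple only works if every instance is typed identically.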

Tilded and Plain Ampersand

Some scribes appear to distinguish between <&> with and without tilde, using the one with a tilde in the English text for "and/ond" and the other without in the Latin and French text for "et." We can record these with &~ if it seems useful, replacing this later with the &tildeamp; (SGML) or &̃ (XML) entity reference. Or we can simply describe the practice in the introduction.

Note:

See the special note at Editorial Correction to a Witness on how to handle a double or single tick, a paraph marker or a "cc" indicator in the margin where an intended rubricated paraph was never drawn.

Note:

We represent scribal punctuation with a space on either side of medial points, and on the left side of terminal points.

Displaying Angle Brackets as Such

Displaying angle brackets as such within notes: You will occasionally want to use angle brackets in notes. Entity references must be used so the browser will not mistake the contents of the brackets for a botched SGML or XML tag. To display <X>, for example, enter &lt;X&gt;.

These entity references are not as cryptic as they might seem, since "&lt;" refers to a "less than," and "&gt;" to a "greater than" symbol. Likewise, "&lpar;" and "&rpar;" refer to "left parenthesis" and "right parenthesis."

Un-rubrished Paraph Indicators

Note:

In some manuscripts, paraphs are sometimes signalled by one or more indicators in plain text ink, such as a full paraph marker, single or double ticks, or a "cc" indicator, which were then missed by the rubrisher.

In such cases, simply transcribe a paraph marker (&para; in SGML, or the plain ¶ character in XML) without adding color tags such as <hi rend="rb"></hi>.

Roman Numerals

External Specifications:

Roman numerals in the text or titles are tagged to appear as numerals in the diplomatic text and as words in the critical text. Use the <orig> (=original) and <reg> (=regularized) tags in tandem as follows:

<orig>xij</orig><reg>twelf</reg>

<orig>xij</orig><reg>duodecimus</reg>

Roman numerals in formework or marginalia require no <reg> tags.

Ambiguous or Illegible Characters

External Specifications:

Use the <unclear> tag to indicate where characters are unclear or ambiguous. If they are unclear due to a scribe's attempt at deletion, usually by erasure or overwriting, <unclear> tags should be nested within <del> tags. (See Deletions, especially Example 5.) If the characters cannot be discerned at all, use <damage> or <supplied> tags instead.

Standard attributes of <unclear> are as follows:

"reason" indicates why the material is hard to transcribe. Standard attribute values are:
- ill-formed
- torn
- faded
- rubbed
- smeared
- overbound
- stained
"resp" indicates the editor responsible for the transcription of the unclear text. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
"cert" signifies the degree of certainty ascribed to the unclear text.
"hand" indicates the hand responsible for action that created the difficulty in transcription, where determinable. See the Identifying Hands section.
"agent" signifies the causative agent for the difficulty. Standard values include:
- water
- mildew

Sample tagging:

<unclear reason="ill-formed" hand="hand1">unclear material</unclear>

<unclear reason="faded" cert="60%">unclear material</unclear>
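Since the standard values must be used exactly as listed, it can be worth checking them mechanically. The Python sketch below is one possible check, not an Archive script; the function name nonstandard_reasons is hypothetical. It uses the standard library XML parser, so it assumes an XML fragment with no undeclared entity references (a fragment containing, say, &punctus; would have to be parsed with the DTD's entity declarations available).

```python
import xml.etree.ElementTree as ET
from typing import List

# The standard values of the "reason" attribute, as listed above.
STANDARD_REASONS = {
    "ill-formed", "torn", "faded", "rubbed", "smeared", "overbound", "stained",
}


def nonstandard_reasons(xml_fragment: str) -> List[str]:
    """Return every reason value on <unclear> that is not in the standard list."""
    # Wrap the fragment so that several sibling elements still parse as one tree.
    root = ET.fromstring(f"<wrapper>{xml_fragment}</wrapper>")
    return [
        el.get("reason")
        for el in root.iter("unclear")
        if el.get("reason") is not None and el.get("reason") not in STANDARD_REASONS
    ]


print(nonstandard_reasons('<unclear reason="Faded">text</unclear>'))
```

The comparison is case-sensitive on purpose, so that a value such as "Faded" is reported rather than silently accepted.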

Spaces and Gaps

External Specifications:

<space/>

Note:

For spaces between words, see Word Division.

Use the <space> (<space/> = XML) tag to indicate where space is left vacant for characters (most often seen where an intended ornamental capital was never made). If the space is due to an erasure, see the Deletions section. Standard attributes for <space> are as follows:

"dim" indicates whether the space is horizontal or vertical.
"resp" indicates the editor who identified and measured the space. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
"extent" indicates the size of the gap. We have thus far used the imprecise unit of the space required for a character in the scribal hand, although you may describe the area affected in inches, millimeters, folios, or whatever makes sense.

Sample tagging:

<space dim="horizontal" extent="6"> (SGML)

<space dim="vertical" extent="2 lines"/> (XML)

Note that this tag has no content and therefore need not be closed in SGML, but will need the closing forward slash in XML (<space/>).
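When an older SGML transcription is converted to XML, the missing closing slash can be added with a pattern replace. The following Python sketch is illustrative only; the function name close_space_tags is hypothetical, and it assumes that no attribute value inside a <space> tag contains a forward slash or an angle bracket.

```python
import re

# An SGML-style <space ...> tag: no "/" anywhere before the closing ">".
_OPEN_SPACE = re.compile(r"<space\b([^<>/]*)>")


def close_space_tags(text: str) -> str:
    """Rewrite <space ...> as <space .../>; tags already closed are left alone."""
    return _OPEN_SPACE.sub(r"<space\1/>", text)


print(close_space_tags('<space dim="horizontal" extent="6">'))
```

Because tags that already end in "/>" never match the pattern, running the function twice on the same file does no harm.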

Stray Marks, Blots, Stains, and Flourishes

External Specifications:

<seg>

Most flourishes deemed by the editor not to be textually significant are neither transcribed nor tagged, though they should always be mentioned in the description of the paleographic features of the hand in the Front matter, and may be described in a paleographical note in the transcription. Unintentional marks, ink trails, blots, smears, bleed-through, or offset ink are neither noted nor transcribed in any way unless they are textually significant.

In F, for example, we recorded in provisional notes a number of apparently otiose curls written erratically over various letters throughout the manuscript, for though we could not determine what they meant, they appear to have been intended. We did not, however, tag them.

Note:

Provisional notes or tagging can be developed and applied to marks that may or may not have a significance that will emerge only upon a broader inspection of the text in relation to them. Such provisional notes or tagging must be rigorous and absolutely regular, in order to make it possible to strip out the markup in the event that the marks are indeed insignificant.

A number of recent projects have presented markings which, although of ambiguous significance, proved to be worthy of markup, in some cases experimentally, in others with standard TEI/PPEA markup. In each instance, a different approach was taken, as our thinking on this topic has developed gradually. Such instances include the small bars added in the margins of L (as in F, these were recorded in the introduction and in notes, but not marked up), the large line filler dashes in M (marked up with a standard TEI <seg> tag), and several types of markings in Ht (tagged experimentally as a means of analysis). Earlier, in the edition of manuscript L, prominent marginal bars, clearly deliberate but of no clear significance, were recorded only in a table in the introduction. Careful consideration should always be given before deciding to mark up something not usually tagged under the protocols.

Words

Word Division

External Specifications:

Overview

Medieval word division is not always consistent with modern usage, and scribal inconsistency of spacing further complicates the matter. Handwritten documents do not present the uniform spacing that modern readers of print have come to expect. In order both to facilitate machine collation and to assist users who will search for specific words, we have decided to resolve the problems of word division by reference to probable scribal meaning. Even in the diplomatic transcriptions, we will not attempt to represent scribal spacing unmediated by reference to meaning.

We follow the word-division of the manuscript as far as practicable, though no attempt is made to represent the variety of spacing between words and letters. The interpretation of the scribe's word-division, though it is generally unambiguous, is occasionally a matter of fine judgment. There is sometimes no obvious space between the indefinite article and the following noun, e.g. afreman (M20.145), but there seems no good purpose in recording these as a single word. A hyphen in the transcription (indicated with <seg type="shadowHyphen">-</seg>) indicates a space in the manuscript within a word, or a compound or phrase conventionally hyphenated today. In doubtful cases we have followed OED and MED to distinguish compounds from phrases. Conversely, some phrases, in particular or elles, at ese and at ones, are written as one word, orelles, atese, atones. Phrases like those are marked with <orig><reg> tags; e.g., <orig>orelles</orig><reg>or elles</reg>

Although we will not attempt to represent every nuance of scribal spacing in the transcript, we will include detailed information in the linguistic descriptions which will accompany each text. It is, therefore, important for each transcriber to collect data during the initial transcription. Provisional tagging and notes will call attention to problematic instances or spacing which requires the transcriber's interpretation. The <orig> / <reg> and <note> tags will be useful at this stage, and may be removed or retained when final decisions about the manuscript are made.

Examples:

We decided not to represent the scribal "tothe" produced (once) by the scribe of MS F but silently regularized it to "to the." Although not tagged in any way in the transcription, this interpretation is noted in the linguistic information.

Markup

Note:

Since it is necessary to distinguish the hyphens used by a scribe from those introduced by the editor as a convenience for identifying compounds, each instance of editorial hyphens in the transcription should be tagged as follows:

a<seg type="shadowHyphen">-</seg>bout

Compounds

In general, we will attempt to represent -emic significance. Compounds, therefore, present a special case. Historically in a state of transition, they may not always concur with modern usage. In addition, within the same hand, a given "word" may appear variously as a single word or two. In such cases, we will indicate scribal spacing as accurately as possible by using a hyphen to indicate the spaces that separate the morphs. For example, we will follow the scribe in transcribing either "beleve" or "be<seg type="shadowHyphen">-</seg>leve."

Close Calls

Close calls: When there is a small space, smaller than that between words and larger than that between letters, after one- and two-letter prefixes such as bi-, by-, to-, or a- (apeyre, among, aboue, etc.), the initial transcriber will decide whether to use the shadow hyphen (<seg type="shadowHyphen">-</seg>) or to represent one word. Provisional notes can provide important data about scribal patterns. See the Editorial Notes section.

Note:

Line fillers should not be transcribed in these or any other instances.

Participles

Grammatical considerations: The graphs "y" and "i" as markers of the past participle are morphemic. When they appear to be separated from the verb stem by a space, we will represent that space by a shadow hyphen (<seg type="shadowHyphen">-</seg>), thus y<seg type="shadowHyphen">-</seg>wro3t, I<seg type="shadowHyphen">-</seg>blessed, y<seg type="shadowHyphen">-</seg>nempned, etc.

Note:

Be careful not to represent instances of the first person singular pronoun + preterite verb with a hyphen.

Allegorical Names

Allegorical Names: In the documentary texts, shadow hyphens will indicate instances where a scribe has separated the morphs of the central allegorical names Do<seg type="shadowHyphen">-</seg>wel, Do<seg type="shadowHyphen">-</seg>bet, and Do<seg type="shadowHyphen">-</seg>best. Regardless of spacing, other allegorical names will not be hyphenated. This policy applies only to the documentary editions, not to later critical editions.

Abbreviations

External Specifications:

<expan>

Use the <expan> tags to indicate the resolution of standard abbreviations; e.g. p<expan>ro</expan>p<expan>ter</expan>. Editors without an SGML or XML browser may find that proofreading is made easier if the expanded material is put between parentheses initially; e.g. p(ro)p(ter). The essential rule here is that one records the interpretation in tags or parentheses, leaving unambiguous graphs outside them. Eventually parentheses will be replaced with <expan> and </expan>. Highly unusual abbreviations can be placed between tags or parentheses, but each should be followed by a paleographic note. See the Editorial Notes section.

Note:

If you use lower ASCII parentheses to indicate <expan> elements that will be globally substituted later, you must indicate any parentheses that are to remain as such with the entities &lpar; (left parenthesis) and &rpar; (right parenthesis), in order to keep them from being replaced with <expan> tags.
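The later global substitution of parentheses can be sketched as follows. This Python example is an illustration, not the Archive's own routine; the function name parens_to_expan is hypothetical. It assumes that expansions are never nested and that every parenthesis meant to survive has already been written as an entity reference (shown here as &lpar; and &rpar;, which is an assumption about the entity names in use).

```python
import re

# One balanced, non-nested pair of ASCII parentheses and its contents.
_PAREN_PAIR = re.compile(r"\(([^()]*)\)")


def parens_to_expan(text: str) -> str:
    """Turn p(ro)p(ter) into p<expan>ro</expan>p<expan>ter</expan>."""
    return _PAREN_PAIR.sub(r"<expan>\1</expan>", text)


print(parens_to_expan("p(ro)p(ter)"))
print(parens_to_expan("a note &lpar;with an aside&rpar; and m(er)ke"))
```

Matching parentheses in pairs, rather than replacing "(" and ")" separately, means that a stray unmatched parenthesis is left in place where it can be found and corrected by hand.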

The various forms of <&c.> and <&> will be represented by the entity reference &amp;. We will indicate scribal spacing, either joining or separating the <&> and <c>. We will also indicate a dot if it appears. The <c.> should be expanded. Some possible combinations are as follows:

&amp; c<expan>etera</expan>
&amp;c<expan>etera</expan>
&amp; c<expan>etera</expan>.
&amp;c<expan>etera</expan>.

Some scribes make a distinction between the ampersand indicating the English "and" from one indicating the Latin "et" by the introduction of a tilde over the ampersand for the English word. This may be indicated with its own entity reference:

&tildeamp; (SGML)
&̃ (XML/Unicode)

Note:

The spelling of expansions will have to be regularized based on the spellings of words that are written out. A frequent case in which variation occurs is in the plural, where the scribe's dialect may motivate -es, -is or even -us. The regularization should represent the majority form.

Word Brevigraphs

External Specifications:

<expan>

SGML and XML allow us to identify the brevigraph we are expanding in the <expan> tag with the attribute abbr. Examples of some of the most common are:

<expan abbr="Ihu&tilde;">Iesu</expan>
<expan abbr="xpi&tilde;">christi</expan>
<expan abbr="Ihs&tilde;">Iesus</expan>
<expan abbr="xpo&tilde;">christo</expan>

If you choose not to identify the brevigraph, use parentheses or <expan> without abbr as in the Abbreviations section; e.g. (Iesu) would eventually become <expan>Iesu</expan>.

Note:

The spelling of word brevigraphs will have to be regularized based on the spellings of words that are written out. A frequent case in which variation occurs is in the word "Christ," which may also be spelled "Crist." The regularization should represent the majority form.

Language Shifts

External Specifications:

Use <foreign> tags to mark Latin/French/German text:

<foreign lang="lat">nota</foreign>
<foreign lang="fre">plus chaud</foreign>
<foreign lang="ger">Schriftsprache</foreign>

Note:

Various means of highlighting text (by changes of script or ink or by underlining, etc.) using the <hi> tag are often associated with changes in the language. The <hi> tags we use to indicate such highlighting may be nested within the <foreign> tags, or vice versa, though arbitrary nesting of these tags is not without consequences for later processing and display.

The following example shows tagging for "Anima" written in textura, in red ink, in a red box:

<hi rend="BinR"><foreign lang="lat"><hi rend="tx"><hi rend="rb">Anima</hi></hi></foreign></hi>

The order of tagging in this example - first boxing, then foreign language, then other <hi> - is preferable because of the way in which all or most display technologies such as CSS, XSL, PERL and any other language that relies on the matching of regular patterns will locate and style or process such features. A regular - and thus an expected and predictable - order of nesting will greatly facilitate later display and analysis of your edition.

Occasionally the foreign text and the highlighting are not conterminous, and this introduces a common complication regarding tag nesting. The example below shows <hi> tags nesting within <foreign> tags, where the phrase "in Infernum" appears with the Latin "in" outside the red box enclosing the rubricated, textura "Infernum."

<foreign lang="lat">in <hi rend="BinR"><hi rend="tx"><hi rend="rb">Infernum</hi></hi></hi></foreign>

If <foreign> tags are nested within <hi> tags, as in the first of the following examples (where one English word and one Latin word are in a red box), the result will parse perfectly, since the DTD allows for such nesting, but in the cases of boxing, underlining, or any other highlighting that forms a continuous line across white space, the second order of nesting is necessary in order to make the styling appear to be continuous, the way it most likely would in a manuscript:

<hi rend="BinR"><foreign lang="lat">Satisfaccio</foreign> dobest</hi> produces

(Red box)Satisfaccio(Red box) (Red box)dobest(Red box)

<foreign lang="lat"><hi rend="BinR">Satisfaccio</hi></foreign><hi rend="BinR"> dobest</hi> produces

(Red box)Satisfaccio dobest(Red box)

Note:

Be aware that <foreign> and <hi> tags do not carry over past </l> tags, so each Latin line will need to be tagged separately, though breaking a line solely with <lb> tags - i.e. when line numbering several manuscript lines as a single Latin line in relation to Kane-Donaldson - does not require the use of additional <foreign> tags. For an in-depth explanation of the use of <lb> tags in conjunction with <foreign> tags, as well as sample code, see the section on line numbering pitfalls in the Line Breaks section that follows.

Unique Readings

External Specifications:

The TEI <app> element is used to encode unique readings. In the case of unique readings, the "wit" attribute of the <lem> tag contains only the sigil of the manuscript in question. The "wit" attribute of the <rdg> tag may contain only the following standard values:

all other mss
most mss

When the unique reading is clearly the result of a scribal error, <sic> tags nest inside of the <lem> tag to indicate that the variant is unintentional. Recording a scribe's unintentional errors is a means of determining their type and frequency, leading to a clearer picture of the scribe and his habits.

A complete TEI <app> tag array is as follows:

<app><lem wit="Dx">kynge</lem><rdg wit="all other mss">knyght</rdg></app>

Patterns such as frequent omission of unstressed monosyllables, miscounting of minims, or transposition of words or letters may aid in distinguishing between two scribes with similar handwriting, or they may corroborate the identification of additional samples of the same hand in other manuscripts.

A textual note may also accompany this tagging, explaining the significance of the reading.
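The <app> array described above can also be generated rather than typed by hand. The following is a minimal sketch in Python, not part of the Archive's tooling: the function name unique_reading and its parameters are hypothetical, and it assumes only the two "wit" values for <rdg> listed in this section.

```python
import xml.etree.ElementTree as ET

# The only values this section permits in the "wit" attribute of <rdg>.
ALLOWED_RDG_WIT = {"all other mss", "most mss"}


def unique_reading(sigil, lemma, reading, rdg_wit="all other mss", scribal_error=False):
    """Build an <app> array for a unique reading (hypothetical helper).

    sigil         -- sigil of the manuscript carrying the unique reading
    lemma         -- the unique reading itself
    reading       -- the reading of the other manuscripts
    scribal_error -- if True, nest <sic> inside <lem>
    """
    if rdg_wit not in ALLOWED_RDG_WIT:
        raise ValueError(f"non-standard wit value for <rdg>: {rdg_wit!r}")
    app = ET.Element("app")
    lem = ET.SubElement(app, "lem", {"wit": sigil})
    if scribal_error:
        # <sic> marks the variant as unintentional.
        ET.SubElement(lem, "sic").text = lemma
    else:
        lem.text = lemma
    ET.SubElement(app, "rdg", {"wit": rdg_wit}).text = reading
    return ET.tostring(app, encoding="unicode")
```

Called as unique_reading("Dx", "kynge", "knyght"), the sketch reproduces the sample array shown in this section.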

Layout

Line Breaks

External Specifications:

Each line in each manuscript in the Archive will be assigned its own number, which will become the "id" attribute value for that line, and must therefore be globally unique. Hence, even if the line is repeated elsewhere verbatim, each of these instances will receive its own unique number. An additional set of attribute values, those in the "n" or "name" attribute of the <l> element, will correlate the line numbers to those in the Athlone editions (initially), and eventually to those in the archetype and the critical text. The "n" attribute may therefore repeat at times, if the same line has been copied into a manuscript more than once.

<l id="Q1.1" n="KDP.400">First instance of line in MsQ corresponding to hypothetical KDP.400</l>

<l id="Q1.2" n="KDP.400">Second instance of line in MsQ corresponding to hypothetical KDP.400</l>

Note:

A concordance of parallel lines in the A/Ax, B/Bx and C/Cx texts is under development by the Archive, and is currently in the alpha test stage, to be published as a reference by SEENET after its beta test.

As in the example above, the format for a tag at the beginning of a line always has both an "id" and an "n" attribute, the "id" corresponding to the absolute numerical position of the line in the manuscript, and the "n" (name) corresponding to the line number of the parallel line in the relevant Athlone edition:

<l id="F1.3" n="KDP.4">

Our line number (the "id" of <l>) in this example is F1.3 which corresponds to KD Prologue line 4 (recorded in the "n" attribute).

Note:

F has skipped a line, causing its line numbers (like its passus numbers) to be out of synch with Kane-Donaldson. The line number and passus number may frequently be different from that of the parallel line in an Athlone edition.

Each line is ended with the tag </l>. We have in Charlottesville a program for inserting both the line numbers and the line terminal tag.
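The Archive's own numbering program is not reproduced here. Purely as an illustration of the id/n scheme, the following Python sketch wraps already-transcribed lines in <l> tags. The function name tag_lines is hypothetical, and the sketch assumes that ids take the form sigil + passus + "." + position and that numbering restarts with each passus, as the sample ids in this section suggest.

```python
from xml.sax.saxutils import quoteattr


def tag_lines(sigil, passus, lines):
    """Wrap transcribed lines in <l> tags (hypothetical helper).

    sigil  -- manuscript sigil, e.g. "F"
    passus -- passus label as used in the ids, e.g. "1" or "P"
    lines  -- iterable of (n_value, text) pairs in manuscript order, where
              n_value is the parallel Athlone line number, e.g. "KDP.4"
    """
    tagged = []
    for position, (n_value, text) in enumerate(lines, start=1):
        # Assumed id pattern: absolute position of the line within the passus.
        line_id = f"{sigil}{passus}.{position}"
        # The text is passed through untouched because it may already
        # contain markup such as <foreign> or <hi>.
        tagged.append(f"<l id={quoteattr(line_id)} n={quoteattr(n_value)}>{text}</l>")
    return tagged
```

Because the "id" comes from the position and the "n" from the editor's collation, a line copied twice receives two different ids but the same "n" value.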

Since the introduction and placement of line break tags is predicated on the assignment of line numbers and editorial decisions as to what constitutes a line in a given manuscript, further discussion appears under the head Line Numbering in the next section.

Line Numbering Special Issues

External Specifications:

Assigning line numbers is fraught with potential pitfalls. The first is in determining what constitutes one line. In Latin passages, it is not always clear whether the scribes intended to write prose, or even verse, as separate lines or as run-ons; i.e., a physical line break may represent the medieval equivalent of either a "soft return" or "hard return." The scribe's use of boxes, upper or lowercase letters, terminal punctuation and grammatical structures may provide clues to his conception of line divisions in long Latin quotations. Each intended line will be enclosed in <l> tags as shown above, without regard to its physical arrangement.

Where the physical line breaks do not correspond to the scribe's perception of a new line, we will insert a TEI line break element, <lb> (SGML) or <lb/> (XML). Consider this example from MS L:

<l id="L13.47" n="KD13.45&agr;"> Vos qui peccata hominum comeditis nisi pro eis lacrimas & oraciones <lb/> effunderitis . ea que in delicijs comeditis . in tormentis euometis</l>

The sentence structure seems to dictate that the line would not end at "oraciones." That the scribe would agree is demonstrated by the indentation of "effunderitis" and his decidedly lower case <e>. (The above example has been stripped of all tags but the ones under discussion. In fact, a <foreign> tag is opened before "Vos" and closed after "euometis.")

The line break element, <lb> or <lb/>, should be used to represent a line break within all <marginalia>, <fw>, <add> and <l> tags as well as within notes citing more than one line of text. Since <lb> / <lb/> is an empty element, it does not need a closing tag in its SGML form, though it does need the special form with the forward slash in XML. As a default, it will cause a line break to occur at the point at which it is inserted, under any stylesheet we may develop. In special cases, however, such as in notes, an <lb/> may be given an "n" attribute value that can be used to distinguish it from other line breaks, making it possible to suppress or add a line break, or insert a pipe character as needed.
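The difference between the SGML and XML forms of an empty element is mechanical, so it can be handled by a small conversion step. The sketch below is one possible approach in Python, offered as an assumption and not as the Archive's procedure. The function name to_xml_empty is hypothetical, and the sketch covers only <lb> and <milestone>, both of which these protocols describe as empty.

```python
import re

# Matches <lb ...> or <milestone ...>, with or without a trailing slash.
_EMPTY_TAG = re.compile(r"<(lb|milestone)\b([^<>]*?)\s*/?>")


def to_xml_empty(text):
    """Rewrite SGML-style empty tags in the XML forward-slash form.

    Tags already written as <lb/> are left equivalent, so the function
    can be run repeatedly without doubling the slash.
    """
    return _EMPTY_TAG.sub(lambda m: f"<{m.group(1)}{m.group(2)}/>", text)
```

A regular expression is adequate here only because these tags never have content; converting paired elements would call for a real SGML or XML parser.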

As soon as the transcription has been finished and properly prepared, the PERL scripts for line numbering and tagging should be run.

Paragraph and Strophic Breaks

External Specifications:

The tag <lg type="strophe"> is to be inserted at the beginning of a strophe and </lg> at the end.

In some manuscripts, strophes are marked with paraphs or skipped lines or both. Record these, with <lb> for skipped lines and ¶ for paraph markers. In most manuscripts, these paraphs are in red, green, and blue. Where the editor has access to the manuscript or a color facsimile, the colors should be recorded in <hi> tags; e.g. <hi rend="bl">¶</hi>.

Note:

See the special note at Editorial Correction to a Witness on how to handle a single or double paraph tick, a parasign, or a "cc" paraph indicator in the margin where the paraph was never drawn or rubricated.

Passus Breaks

External Specifications:

The tag <div1 type="passus" n="Xpass[number]">--where "X" is the sigil of the manuscript and "[number]" is the number of the passus--should precede the transcription of each passus. The final item, the content of the "n" attribute, will of course change with each passus. The closing tag </div1> is inserted at the end of the passus after any trailer, if there is one. <div1></div1> is always the outermost container of each passus file.

Where non-standard passus divisions occur, indicate where passus divisions appear in the archetype with this tag: <milestone unit="Bpassus" n="[number]"> (SGML) or <milestone unit="Bpassus" n="[number]"/> (XML). The milestone tag is always empty, so it does not need to be closed in SGML, but in an XML document, it requires the special forward-slash format: <milestone/>.

Note:

The line numbering program ignores <milestone> tags, so it is safe to insert them before sending the file for line assignment.

Foliation

External Specifications:

Between the bottom of each leaf and the top of the next, supply a tag such as this:

<milestone unit="fol." n="36v" entity="M036v"> (SGML)

<milestone unit="fol." n="36v" entity="M036v"/> (XML)

The transcription of folio 36v follows the tag. Since the <milestone> or <milestone/> tag is always empty, marking only the beginning point of the folio boundary, it need not be closed in SGML, but requires the forward slash format in XML.

Note:

The line numbering program ignores <milestone> tags, so it is safe to insert them before sending the file for line assignment.

The entity attribute value of the <milestone> / <milestone/> above refers to the hyperlinked image for folio 36v of manuscript M. Make certain that your images are named on the regular pattern sigil-folio-side, since this will make it easier to set up a regular and accurate pattern of entity naming for the images and their links in the edition.
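If image names follow the sigil-folio-side pattern, the foliation tag can be derived from three values. The following Python sketch illustrates the idea. The function name folio_milestone is hypothetical, and the three-digit zero padding of the folio number is an assumption inferred from the single sample value "M036v".

```python
def folio_milestone(sigil, folio, side, xml=True):
    """Return a foliation <milestone> tag (hypothetical helper).

    sigil -- manuscript sigil, e.g. "M"
    folio -- folio number as an integer, e.g. 36
    side  -- "r" for recto or "v" for verso
    xml   -- True for the XML forward-slash form, False for SGML
    """
    if side not in ("r", "v"):
        raise ValueError(f"side must be 'r' or 'v', got {side!r}")
    n_value = f"{folio}{side}"
    # Assumed entity pattern: sigil + folio padded to three digits + side.
    entity = f"{sigil}{folio:03d}{side}"
    close = "/>" if xml else ">"
    return f'<milestone unit="fol." n="{n_value}" entity="{entity}"{close}'
```

Generating the tag from the same values used to name the image files keeps the entity names and the links in step with one another.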

Forme Work

External Specifications:

<fw>

The <fw> element identifies material added by the scribes or printers to indicate codicological structure, such as headings, top-of-page titles, catchwords, corrector's marks, guide words for the rubricator in the margin, etc. Attributes of <fw> include:

An "id" attribute is always added as a unique identifier for each instance of formework. This id will be assigned by a Perl script after the other editorial work is completed.

For "type" use only the following categories:

running head
page (for the page number)
fol (for the folio number)
sig (=signature)
qSig (=quire signature)
lSig (=leaf signature)
catch (=catchword)
cor (where the corrector "signs off" on a gathering)
guideWords (where scribe has written instructions for the rubricator)
guideLetters (where the scribe has inserted a guide for the ornamented capital)

For "place" use only the following categories:

inline
supralinear
sublinear
marginLeft
marginRight
topLeft
topCenter
topRight
bottomLeft
bottomCenter
bottomRight

Note that the categories are written as one word, camel cased in all of these sample forme work tags except for "running head":

<fw type="catch" place="bottomRight">And then</fw>
<fw type="sig" place="bottomCenter">g iij</fw>
<fw type="cor" place="bottomLeft">coret</fw>
<fw type="running head" place="topCenter">Piers Plowman</fw>
<fw type="guideWords" place="marginRight"><foreign lang="lat">Passus primus de visione</foreign></fw>

We do not include modern foliation in our transcription but characterize it in the description of the manuscript in the introduction.
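Because the "type" and "place" values are closed lists, a transcription can be checked against them before processing. The sketch below shows one way to do so in Python. The function name check_fw is hypothetical, and the sketch assumes the fragment being checked is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Closed value lists copied from this section.
FW_TYPES = {
    "running head", "page", "fol", "sig", "qSig", "lSig",
    "catch", "cor", "guideWords", "guideLetters",
}
FW_PLACES = {
    "inline", "supralinear", "sublinear", "marginLeft", "marginRight",
    "topLeft", "topCenter", "topRight",
    "bottomLeft", "bottomCenter", "bottomRight",
}


def check_fw(fragment):
    """Return the non-standard type/place attributes found on <fw> tags."""
    # Wrap the fragment so that several sibling tags parse as one document.
    root = ET.fromstring(f"<root>{fragment}</root>")
    problems = []
    for fw in root.iter("fw"):
        for attr, allowed in (("type", FW_TYPES), ("place", FW_PLACES)):
            value = fw.get(attr)
            if value not in allowed:
                problems.append(f'{attr}="{value}"')
    return problems
```

An empty list means every <fw> in the fragment uses only the listed categories; the comparison is case-sensitive, so "bottomright" would be reported.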

Highlighting and Appearance

The <hi> element is used to describe the various ways scribes might call attention to text, such as by changes in script, size, or color, by underlining or boxing, etc. The only attribute we will need is "rend."

Be aware that <hi> tags, like <foreign> tags, do not carry over past </l> tags, so each line will need to be tagged separately. (See the note on <hi> tag nesting for examples of how <hi> and <foreign> tags may be nested.) If the highlighted text was added after initial copying, the <hi> tags should be nested within <add> tags.

Ornamental Capitals

External Specifications:

<hi>

<hi rend="o8">N</hi>Ow

This tag marks an ornamental capital "N" of 8 lines height followed by a capital "O" and lower case "w". Note that the <o> is the letter <o>, not the digit zero. We do not specify width.

Changes of Script

External Specifications:

In the following example, "Danyel" is written in textura:

<hi rend="tx">Danyel</hi>

We are interpreting shifts in type of script as being for the purpose of emphasis or highlighting. The TEI actually has a <handShift> (SGML) / <handShift/> (XML) tag, but as an empty tag, it is less suitable to our use than one might imagine. <emph> tags can be used instead of <hi> tags. I chose the latter because it is more non-committal (HND).

Note:

Standard reference works on scripts include:

Michelle P. Brown, A Guide to Western Historical Scripts from Antiquity to 1600, London: The British Library, 1990
M. B. Parkes, English Cursive Book Hands, 1250-1500, Oxford: Clarendon, 1969
Jean F. Preston and Laetitia Yeandle, English Handwriting, 1400-1650: An Introductory Manual, Binghamton, N.Y.: Medieval & Renaissance Texts & Studies, 1992

Rubricated and Other Color-Highlighted Words and Phrases, and Otherwise Highlighted Text

External Specifications:

<hi>

<hi rend="rb">Dowel</hi>
<hi rend="tr">Dowel</hi>
<hi rend="tr">D</hi>owel
<hi rend="bl">D</hi>owel
<hi rend="gr">D</hi>owel

In the first example, "Dowel" is rubricated. In the second, the black letters are touched with red ink. In the third example, the "D" alone is touched in red. In the last two examples the initial <D> is written in blue and green ink, respectively.

Underlined Words

External Specifications:

<hi>

The tagging in the following example indicates that "Glotoun" is underlined in text ink (or the color is unknown to the transcriber):

<hi rend="ul">Glotoun</hi>

If a scribe clearly intends to underline a word, tag the whole word even if the line begins after the first letter or ends before the last.

Boxed Words and Phrases

External Specifications:

<hi>

Examples of <hi> tagging for boxing are:

<hi rend="boxed">Repentaunce</hi>

<hi rend="BinR">Repentaunce</hi>

Common abbreviations for contents of rend attributes

lc = Lombard Cap
o[number] = ornamented capital, N lines high
bigger[number] = taller than usual letter, N lines high
br = brown ink
gr = green ink
bl = blue ink
rb = rubricated
tr = touched in red
tg = touched in green
tx = textura
ul = underlined with color unspecified or text ink
ur = underlined in red
ulANDol = underlined and overlined with color unspecified or text ink
ulrANDolr = underlined and overlined in red
boxed = boxed with color unspecified or text ink
BinR = boxed in red

Add example here of boxing versus flourished underline, including picture.

Three Special Instances of `<hi>` Tag Use

External Specifications:

<hi>

Note:

There are three values for the "rend" attribute of <hi> that we will use in notes only. They are: "bold" (for the A, B and C of A-Text, B-Text and C-Text), "sup" (for superscript characters, usually in manuscript sigils) and "it" (italic, for quotation from the transcription). Do not use these values in the transcription itself.

Example: "Other <hi rend="bold">B</hi> manuscripts read . . . ."

For a complete discussion of the handling of notes within the transcription, please see the section on Editorial Notes.

Scribal Changes to a Manuscript

Damage

External Specifications:

In general, we record only damage made after the manuscript was first written. Those defects already in or on the vellum or paper and written around are not textually significant. We record damage only if it makes the text unclear or illegible.

If the damaged text cannot be transcribed with certainty, use <unclear> tags.

In a s<unclear agent="water">omer se</unclear>soun

If it is completely illegible (cropped, for example), use <supplied> tags to record the damage and supply the missing text, though only if you wish to supply such text.

The <unclear> or <supplied> elements may be used instead of or in addition to the <damage> tag. Possible attributes and their values are as follows:

"type" describes the damage. Standard attribute values are:

torn
cropped
faded
rubbed
smeared
stained
overbound
creased

"agent" signifies the cause of the damage. Standard attribute values are:

water
mildew

"extent" indicates the size of the damaged area. We have thus far used the imprecise unit of the space required for a character in the scribal hand, although you may describe the area affected in inches, millimeters, folios, or whatever makes sense.
"resp" refers to the transcriber who makes the decision about the existence, type, and extent of the damage. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
"hand" indicates the scribal hand responsible for the damage, where determinable. So far, we have had no occasion to use this attribute, but sample values would follow the hand designations declared in the TEI header, as follows:

hand1
hand2
handx

Examples:

sapien<damage type="cropped"><supplied source="other B manuscripts">ter</supplied></damage>

<damage type="stained">wandrynge</damage>

Additions

External Specifications:

Note:

If you have a transcription with markup finished before the publication of manuscripts L and O (November 2004), the marginalia are most likely recorded in <add> tags within notes, and will have to be moved into <marginalia> tags.

Six Elements Mark-Up Additions

There are six elements that we use to tag additions, <add>, <addSpan>, <fw>, <head>, <trailer> and <marginalia>.

`<add>` and `<addSpan>`

The <add> tag serves to mark up phrase level text and <addSpan> to mark up larger blocks.

Note:

Effective as of July 2003

Use <add> tags only for words and phrases introduced into the text after the initial copying (whether by the original scribe, contemporary or later scribes).

Use <marginalia> for all marginalia copied into the manuscript at the initial time of its production.

Do not in any case use <add> for material you have added to the text. (See the Editorial Intervention section.)

Use <add> tags for textual matter added to the text after the initial transcription. Forme work and marginalia tags can also be marked when necessary with <add> tags, though in many cases it will be impossible to identify the hand responsible for the addition to the text.

The attribute "place" designates the point at which the addition is made. Use only the following values:

inline
supralinear
sublinear
marginLeft
marginRight
topLeft
topCenter
topRight
bottomLeft
bottomCenter
bottomRight

Note that the designations are written as one word, with camel-casing, exactly as they appear. This is of importance for later processing and display. Other attributes of the <add> element are:

"hand" identifies the scribe who made the addition. See the Identifying Hands section.
"resp" identifies the editor or transcriber who identified the hand. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
"cert" signifies the transcriber's degree of certainty as to the identification of the hand.

The following examples illustrate tagging for words of the poem omitted during initial copying but subsequently supplied, the first above the line by the original scribe, the second careted and written in the right margin by an unidentified hand:

<add place="supralinear" hand="hand1">for</add>

<add place="marginRight" hand="handx">Dowell</add>

You may wish to reiterate or supplement the information in <add> tags with a note, as in the sample line below:

Right so <add place="supralinear" hand="hand1">bi</add><note type="textual"> All other manuscripts omit <hi rend="it">bi</hi>, added above the line in W.</note> persons and preestes

See the Editorial Notes section for further discussion on the tagging and nesting of notes.

Marginalia Cited in Legacy Notes

Note:

Because the <add> tag used to be used to mark up marginalia, recorded inside codicological notes, legacy markup of this kind will have to be moved into <marginalia> tags, which are documented in the Marginalia section.

If the added material is not meant to be part of the text, put it into <marginalia> or <fw> tags as appropriate. You will in many cases wish to attach an explanatory note as in the following instance where "Stretford" is written in the left margin by a later hand.

<marginalia id="XP.14m1" place="marginLeft" hand="hand3">Stretford<note id="XP.14m1n1" type="codicological"><ref>XP.14:</ref> A sixteenth-century hand has added <hi rend="it">Stretford</hi> in the left margin.</note></marginalia> <l id="XP14" n="etc.

Since hand3 is already identified in the header and introduction as a sixteenth-century hand and the information about place and hand appears in the display if asked for, probably the discursive note is unnecessary.

`<addSpan>`

The <add> tag is used for short sequences of text, single words, or phrases. <addSpan> must be used for larger level additions because <add> tags do not carry over past structural boundaries like </l>. (See the Line Breaks section.) <addSpan> has the same attributes as <add>, with the addition of the attribute "to," which refers to the spot where the added material ends. (There is also the possibility of using the attribute "type," but that would be used only if the added text is not on an original manuscript page.) Instead of the expected closing tag, <addSpan> tags are closed by an <anchor> tag placed at the end of the span of added text. If, for example, two lines were omitted in the body of the text and added by the original scribe in the bottom margin, the tags might appear as follows:

<l id="L5.257" n="KD5.252"><addSpan place="bottomCenter" hand="hand1" TO=addend01> And haue ymade many a knyƷte . bothe mercere & draper<expan>e</expan></l>

<l id="L5.258" n="KD5.253"> þat payed neuere for his prentishode . nouƷte a peire gloues <anchor id=addend01></l>

Note:

The value in the "to" attribute and <anchor> id may not be a line number or any other element present elsewhere. We will use "addend" + a number. Also, this value is not in quotation marks like all others we use. Finally, the <addSpan> and <anchor> tags must each be within <l> tags, as in the above example.
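Since an <addSpan> is closed only by a matching <anchor>, a missing or mistyped anchor is easy to overlook. The Python sketch below pairs the two by plain pattern matching, so that it also accepts the unquoted SGML attribute values shown above. The function name unclosed_addspans is hypothetical, and the sketch is offered as a rough proofreading aid under those assumptions, not as a validating parser.

```python
import re

# Capture the value of "to" on <addSpan> and of "id" on <anchor>,
# quoted or unquoted, in either letter case.
_ADDSPAN_TO = re.compile(r"<addSpan\b[^>]*?\bto\s*=\s*[\"']?([\w.-]+)", re.IGNORECASE)
_ANCHOR_ID = re.compile(r"<anchor\b[^>]*?\bid\s*=\s*[\"']?([\w.-]+)", re.IGNORECASE)


def unclosed_addspans(markup):
    """Return the "to" values of <addSpan> tags that have no matching <anchor>."""
    targets = _ADDSPAN_TO.findall(markup)
    anchors = set(_ANCHOR_ID.findall(markup))
    return [target for target in targets if target not in anchors]
```

The same pairing logic would presumably apply to <delSpan>, given that the two elements are described as similar, though that is not covered by this sketch.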

`<head>` and `<trailer>`

Headers and trailers such as the passus headers and explicits are marked up with <head> and <trailer> tags, which are documented in the Headers and Closing Tags for Each Passus section.

`<fw>`

Running titles, guide words, catchwords and signatures and any other forme work are marked up with <fw> tags, documented in the Forme Work section.

`<marginalia>`

Marginalia are marked up with the <marginalia> tag, documented in the Marginalia section.

Deletions

External Specifications:

As with <add> and <addSpan> above, we use two elements to tag deletions, <del> and <delSpan>. <del> serves to mark up phrase level text and <delSpan> to mark up larger blocks.

Use <del></del> tags where a word or passage is deleted or marked for deletion by a scribe, annotator, or corrector. The content of these tags may be either the characters that were deleted, if they are legible either under white or ultraviolet light, or symbolic if they are not legible. Symbolic representations of deleted characters should be supplied as follows: one period (.) per character up to five characters when it is possible to determine or guess the number of characters deleted, ...?... for deletions of six to a dozen characters, and ...?...?... for deletions of one half-line or more. In some cases, readers should be told in a paleographical note to consult the manuscript images, and should be given a link to the relevant image.
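The rule for symbolic content can be stated as a small function. The Python sketch below is illustrative only, and the function name deletion_placeholder is hypothetical. The protocol does not say what to use for a deletion longer than a dozen characters but shorter than a half-line; the sketch assumes the six-to-a-dozen form in that case.

```python
def deletion_placeholder(char_count=None, half_line_or_more=False):
    """Return the symbolic content for an illegible deletion.

    char_count        -- estimated number of deleted characters
    half_line_or_more -- True if the deletion covers a half-line or more
    """
    if half_line_or_more:
        return "...?...?..."
    if char_count is None or char_count < 1:
        raise ValueError("give a positive character count or set half_line_or_more")
    if char_count <= 5:
        # One period per deleted character, up to five.
        return "." * char_count
    # Six to a dozen characters; longer runs short of a half-line are
    # treated the same way here, which is an assumption of this sketch.
    return "...?..."
```

For example, an erased word of roughly three letters would be entered as <del rend="erasure">...</del>.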

If the deleted text is unclear, <unclear> tags may be nested within <del> tags (as shown in Example 4 below). <unclear> tags give the option of expressing your degree of confidence in the reading. If the deleted text can be easily read, or, at the opposite extreme, cannot be read at all, there is no need to insert <unclear> tags.

Note:

The <del> tag will be followed by <add> tags (previous section) where a scribe has deleted and then substituted text. (See Examples 2-5 below.) This ordering is not only logical, but also has consequences for later processing and display.

Standard attributes and attribute values for the <del> tag are as follows:

"rend" indicates how the deletion was made in the text. Standard attribute values are:

subpunction
erasure
overwritten
linedThrough
bracketed
ul

The "intended" attribute value, which indicates a deletion that was intended but not realized, must always be accompanied by a note. (See Example 5.)

"type" is synonymous with rend. We have chosen to use rend.
"resp" indicates the editor responsible for identifying the hand of the deletion. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
"hand" identifies the scribe responsible for the deletion. See the Identifying Hands section.
"cert" indicates the degree of certainty in attributing the deletion to a hand.

Example 1) The word "Plowman" struck through with nothing substituted: <del rend="linedThrough" hand="hand1">Plowman</del>

Example 2) The character "t" subpuncted in the word "clept" and replaced in the right margin with "l": c<del rend="subpunction" hand="hand2">t</del><add place="marginRight" hand="hand2">l</add>ept

Example 3) The letter "e" erased and replaced with "y": <del rend="erasure">e</del><add place="inline">y</add>

Example 4) Some letter, probably an "o," overwritten with "n": <del rend="overwritten" hand="hand1"> <unclear cert="80%">o</unclear></del><add place="inline" hand="hand1">n</add>

Example 5) The original scribe wrote "for egre" where all other manuscripts have "ful egre." A partially scraped corrector's mark in the margin indicates that the error was noticed. A scribe (not the text scribe) has added correct "ful" in the right margin, without deleting "for." <del hand="hand3" rend="intended">for</del><add place="marginRight" hand="hand3">ful</add><note resp="hnd" type="codicological">A partially scraped corrector's cross in the left margin indicates that the correction was intended but not carried out.</note>

The <del> tag is used for short sequences of text, single words, or phrases. <delSpan> must be used for larger level deletions because <del> tags do not carry over past structural boundaries like </l>. <delSpan> has the same attributes as <del>, with the addition of the "to" attribute, which refers to the spot where the deletion ends. Because <delSpan> and <addSpan> are similar, the example shown in the previous section for <addSpan> should be helpful.

Editorial Intervention

Editorial Alterations to the Text

Text Supplied

External Specifications:

<supplied>

You may use <supplied> tags where text is missing or completely illegible and you can supply it by reference to another source or by conjecture.

Standard attributes for <supplied> are as follows:

"reason" indicates why the material had to be supplied. Standard values for this attribute are:

torn
faded
rubbed
smeared
overbound
stained
patched

"resp" indicates the editor responsible for supplying the letter, word, or passage contained within the <supplied> element. The default will be the initials of the named editor(s) as they appear in the TEI header, and need not be recorded.
"source" states the source of the supplied text (the editor's initials in the case of conjecture).
"hand" indicates the scribal hand responsible for the damage that obliterated the text, where determinable. So far, we have had no occasion for this attribute.
"agent" signifies the causative agent for the loss of text, where determinable. Standard values are:

mildew
stained
water

Note:

Note the crucial distinction between the easily confused "hand" and "agent" attributes.

Sample <supplied> tags:

<supplied reason="cropped" source="other B manuscripts">Wh</supplied>er

Not<supplied reason="overbound" source="hnd">a</supplied>

Editorial Correction to a Witness

External Specifications:

TEI-conformant SGML and XML permit several different ways of marking corrections to the base manuscript by an editor. We have chosen to use the <sic><corr> tags in tandem to show the manuscript reading and our emendation, respectively, because this will provide the widest range of display options. Note the use of square brackets in the following examples:

<sic>for</sic><corr>for [hem]</corr>

<sic>kaue</sic><corr>k[n]aue</corr>

Note:

In instances where a single or double tick, a paraph marker, or a "cc" indicator was written in the margin to indicate to the rubricator where the parasign should go, but it was never drawn, simply record a parasign--¶--without adding <hi rend="rb"> (rubric) tagging.

The <corr> tag may include the "resp" and "cert" attributes if this more complicated markup seems necessary or useful:

<sic>kaue</sic><corr resp="hnd" cert="100%">k[n]aue</corr>

Editor Refrains from Correcting a Witness He Thinks Is Mistaken

External Specifications:

A <sic> tag may be used without <corr> if the editor elects not to correct the text.

<sic>seten to seten to</sic>

A more complicated tag can be used if the editor wishes not to display a correction but does want to record his opinion.

<sic resp="hnd" corr="my[s]chief" cert="99%">mychief</sic>

In this example, "cert" indicates the degree of certainty ascribed by a spineless HND to what he believes is the correct reading, but the displayed text will contain the erroneous reading "mychief." Note that a style sheet can be crafted that will display either the tag or the attribute, so the emendation is in fact made here but not displayed.

Editorial Notes

External Specifications:

Editorial Notes within `<l>` Tags

The Medieval Academy of America and the Chicago Style Manual

Note:

Bibliographies and notes must conform to the style manual of our publisher, the Medieval Academy of America. In doubtful cases, the Medieval Academy refers authors to the Chicago Manual of Style, 15th ed. (2003).

Note:

Protocols regarding when to make a note, when to record in markup only, and when to have both are forthcoming.

IDs and `<ref>` Tags for Notes within `<l>` Tags

We will use <note> tags with nested <ref> tags indicating line number, as follows:

<note type="textual"><ref>M20.65:</ref> Content of note.</note>

All notes must also be assigned an ID number, the value of which is based on the line number in which the note appears. Thus the note above, if it were the first note in the line, would receive the following ID:

<note id="M20.65n1" type="textual"><ref>M20.65:</ref> Content of note.</note>

Subsequent notes on this line would be numbered "M20.65n2," "M20.65n3," and so forth.

Note:

Since these ID values are best assigned by a script, they can be added to an edition at the final stage of work, rather than by hand as the editor works.
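The Archive's own script is not shown in these protocols. As an illustration of the numbering pattern only, the Python sketch below adds sequential ids to the notes in one line of markup. The function name number_notes is hypothetical, and the sketch assumes that none of the notes in the line carries an id yet.

```python
import re

# Opening of a <note> tag, with or without attributes.
_NOTE_OPEN = re.compile(r"<note\b")


def number_notes(line_id, line_markup):
    """Add id="<line_id>n1", "<line_id>n2", ... to the notes in one line.

    line_id     -- id of the enclosing <l>, e.g. "M20.65"
    line_markup -- the markup of that line, containing zero or more notes
                   that do not yet have an id attribute
    """
    counter = 0

    def add_id(match):
        nonlocal counter
        counter += 1
        return f'<note id="{line_id}n{counter}"'

    return _NOTE_OPEN.sub(add_id, line_markup)
```

Running the numbering as a last step, as the note above recommends, means that notes can be added, removed or reordered during editing without the ids falling out of sequence.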

So far we have designated the following note types: codicological, paleographic, linguistic, lexical, historical, source, theological, and textual.

Order of Sigils Listed in Textual Notes

Sigils should be listed in textual notes with the base text sigil first, followed by the beta and alpha sigils in the following order:

WHmCrGYOC2CBLMRF becomes WLMCrHmCGOC2YBRF, where "W" would be the base manuscript.

This sigil order is based on the stemma constructed by Robert Adams.
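The reordering can be done mechanically once the sigils are separated. The Python sketch below is illustrative: the function names split_sigils and order_sigils are hypothetical, the order list is taken from the single example above, and the treatment of a base manuscript other than W (base first, the rest in the same order) is an assumption.

```python
import re

# Order taken from the example above: WLMCrHmCGOC2YBRF.
STEMMA_ORDER = ["W", "L", "M", "Cr", "Hm", "C", "G", "O", "C2", "Y", "B", "R", "F"]

# A sigil is a capital letter optionally followed by one lowercase
# letter or one digit (Cr, Hm, C2).
_SIGIL = re.compile(r"[A-Z](?:[a-z]|\d)?")


def split_sigils(run):
    """Split a run such as "OC2CB" into ["O", "C2", "C", "B"]."""
    return _SIGIL.findall(run)


def order_sigils(sigils, base):
    """Return the sigils as one string, base text first, then stemma order."""
    rank = {sigil: index for index, sigil in enumerate(STEMMA_ORDER)}
    unknown = [s for s in sigils if s not in rank]
    if unknown:
        raise ValueError(f"sigils not in the order list: {unknown}")
    rest = sorted((s for s in set(sigils) if s != base), key=rank.__getitem__)
    return "".join([base] + rest)
```

For instance, order_sigils(split_sigils("WHmCrGYOC2CBLMRF"), base="W") yields the reordered string given in the example above.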

We will follow the convention of displaying the Piers Plowman A, B, and C designations in bold type: <hi rend="bold">B</hi>. Other useful values for <hi rend> in notes are "sup" (=superscript) and "it" (=italic).

<note type="textual"><ref>W3.83:</ref> W alone reads <hi rend="it">enpoisone</hi>. All other <hi rend="bold">B</hi> manuscripts read <hi rend="it">poisone</hi>, except OC<hi rend="sup">2</hi> which have <hi rend="it">punyschen</hi>.</note>

See the Punctuation section for how to display angled brackets in notes.

A provisional note, one you intend to remove after some issue is resolved, may take the following simplified form:

<note> Content of note.</note>

Note:

For notes attached to marginalia, formework, guideletters and other secondary matter in the transcription, see the section on Editorial Notes on Matter Other than Primary Text.

Editorial Notes on Matter Other than Primary Text

External Specifications:

<note>

Often, you will need to make a note on an element of the manuscript other than the main body of the text, such as formework, headers, trailers and marginalia. Such notes should follow the same format as any other note, except that their <ref> cannot be keyed to the ID of the <l> element in which they appear, since they do not in fact appear inside an <l>.

Note:

Since ID values are best assigned by a script, they can be added to an edition at the final stage of work, rather than by hand as the editor works. <ref> content, however, should be added by the editor at the time the note is generated.

Four Basic Conventions: Headers, Interlinear Elements, Formework & Trailers

In such cases, we have developed a single convention with four variants: one for notes on elements appearing before or within the header of a passus; a second for elements appearing after the header, either before the first line or between the <l> elements within a passus; a third for formework; and a fourth for elements appearing in conjunction with trailers.

Note:

In every case detailed below, the note should be nested inside the element on which it comments, just as notes on passages inside of <l> elements are always nested inside the <l>.

Notes Before or Within a `<head>` Element

We encode notes appearing before or within the <head> element of a passus with the fictitious line number zero (0) as the content of their <ref> elements. Thus, the note in MsM on the marginalium Assit principio... that appears before the first line of the poem is encoded with the content of the <ref> element set to "MP.0:" and with IDs based on this fictitious line number zero (0):

<marginalia id="MP.0m1" place="topCenter" hand="handx">[element content]<note id="MP.0mn1" type="codicological" place="unspecified" anchored="yes"><ref>MP.0:</ref> The heading is written in a similar ink to that of the text...</note></marginalia>

The ID values are not as cryptic as they may seem. "MP.0m1" represents "Manuscript M, Prologue, line zero, marginalium number 1." Likewise, "MP.0mn1" represents "Manuscript M, Prologue, line zero, marginalium number 1, note number 1."
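To make the structure of these values concrete, here is a small Python sketch that splits an ID into its parts. The pattern is inferred only from the examples in this section (suffixes "m", "h", "fw" and "t", with "n" marking a note). The function name `parse_id` and the restriction to single-capital sigils such as "M" are assumptions, so treat it as a reading aid rather than a specification.

```python
import re

ID_PATTERN = re.compile(
    r"^(?P<sigil>[A-Z])"      # manuscript sigil (a single capital is assumed)
    r"(?P<passus>P|\d+)"      # 'P' for the Prologue, otherwise the passus number
    r"\.(?P<line>\d+)"        # line number; 0 is the fictitious line zero
    r"(?:(?P<kind>fw|m|h|t)?(?P<note>n)?(?P<seq>\d+))?$"
)

KINDS = {"m": "marginalium", "h": "head", "fw": "formework", "t": "trailer", None: "line"}


def parse_id(value: str) -> dict:
    """Split an ID such as 'MP.0mn1' into its components."""
    match = ID_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a recognised ID: {value!r}")
    seq = match.group("seq")
    return {
        "sigil": match.group("sigil"),
        "passus": match.group("passus"),
        "line": int(match.group("line")),
        "element": KINDS[match.group("kind")],
        "is_note": match.group("note") is not None,
        "number": int(seq) if seq is not None else None,
    }


print(parse_id("MP.0mn1"))
```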

The note itself should be nested inside of the element on which it comments, in this case, inside the <marginalia> element. A note on the content of the header itself would be encoded on exactly the same model, and would also be nested inside of the <head> element:

<head id="M2.0h1">[element content]<note id="M2.0hn1" type="codicological" place="unspecified" anchored="yes"><ref>M2.0:</ref> No blank line follows this rubric, which is centered.</note></head>

Notes on Marginalia Appearing After a `<head>` Element

Notes that need to be made on marginalia appearing after the <head> element (generally between <l> elements) should be assigned the line number of the line nearest to them, with <marginalia> tags and their contents placed immediately above that line in the transcription.

<l id="M2.114" n="KD2.112">Munde þe Mellere...</l>

<marginalia id="M2.115m1" place="marginRight" hand="hand3">[marginalium]<note id="M2.115mn1"><ref>M2.115:</ref>[note]</note></marginalia>

<l id="M2.115" n="KD2.113">In þe date...</l>

Note:

In this example and the one following, line wraps have been added to clarify where elements begin and end. Additional line wraps may also be invoked by your browser if your screen is set to a resolution below 1024x768. In no case should such extra "hard returns" be added to your transcription.

Notes on Formework Appearing After a `<head>` Element

Formework appearing at the top of a leaf takes the ID of the first line on the leaf, but the catchwords and signatures of various kinds at the foot of a leaf take the number of the last <l> on the leaf:

<l id="M2.130" n="KD2.128">Ȝe shul abiggen it boþe . by god þat me made . </l>
</lg>
<fw id="M2.130fw1" type="catch" place="bottomRight">Wel ȝe wyten wernardus</fw>
<fw id="M2.130fw2" type="cor" place="bottomRight"><hi rend="ur">ex<expan>aminatur</expan></hi></fw>
<fw id="M2.130fw3" type="cor" place="bottomLeft">coret</fw>
<fw id="M2.130fw4" type="cor" place="bottomCenter">coret</fw>
<fw id="M2.130fw5" type="quire signature" place="bottomRight">I<expan>us</expan></fw>
<milestone n="9r" unit="fol." entity="B.M9r"/>
<fw id="M2.131fw1" type="runningHead" hand="hand5" place="topRight">ij<expan>us</expan> p<expan>assus</expan></fw>
<lg type="strophe" org="uniform" sample="complete">
<l id="M2.131" n="KD2.129"><note id="M2.131n1" type="codicological"><ref>M2.131:</ref> The <//> is to indicate...</note> Wel...faille</l>
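Once IDs have been assigned, the formework rule can be checked mechanically. The Python sketch below assumes that a "place" value beginning with "bottom" marks the foot of a leaf and one beginning with "top" marks the head. That reading is suggested by the example, not stated as a rule. The function name `misnumbered_formework` is hypothetical, and the input is assumed to be well-formed XML.

```python
import re
import xml.etree.ElementTree as ET


def misnumbered_formework(xml_text: str) -> list:
    """Return the IDs of <fw> elements whose line number breaks the rule."""
    root = ET.fromstring(xml_text)
    # <l> and <fw> elements in document order, ignoring nesting in <lg>.
    elems = [e for e in root.iter() if e.tag in ("l", "fw")]
    problems = []
    for i, elem in enumerate(elems):
        if elem.tag != "fw":
            continue
        # Strip the 'fw<k>' suffix to recover the line ID the element claims.
        claimed = re.sub(r"fw\d+$", "", elem.get("id", ""))
        place = elem.get("place", "")
        if place.startswith("bottom"):
            neighbours = reversed(elems[:i])   # look back to the last <l>
        elif place.startswith("top"):
            neighbours = elems[i + 1:]         # look ahead to the first <l>
        else:
            continue  # other placements are outside this sketch
        expected = next((n.get("id") for n in neighbours if n.tag == "l"), None)
        if expected != claimed:
            problems.append(elem.get("id"))
    return problems


# Hypothetical, simplified version of the example above.
sample = (
    '<div><lg><l id="M2.130">a</l></lg>'
    '<fw id="M2.130fw1" type="catch" place="bottomRight">b</fw>'
    '<milestone n="9r"/>'
    '<fw id="M2.131fw1" type="runningHead" place="topRight">c</fw>'
    '<lg><l id="M2.131">d</l></lg></div>'
)
print(misnumbered_formework(sample))
```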

Notes on Trailers (`<trailer>`)

Since trailers follow the last line contained within a <div>, and are typically the last element to appear within any <div>, they cannot take as part of their ID value or <ref> content the line number of a following line. Hence, they simply take that of the preceding one, following the rest of the conventions exactly as in the other cases:

<trailer id="M20.386t1"><foreign lang="lat"><hi rend="display"><hi rend="tr">E</hi>xplicit hic dialogus...</hi></foreign> <note id="M20.386tn1" type="textual"><ref>M20.386:</ref> This form of explicit...</note></trailer> <trailer id="M20.386t2"><foreign lang="lat"><hi rend="display">Penna precor...;</hi></foreign><note id="M20.386tn2" type="textual"><ref>M20.386:</ref> The Colophons de...</note></trailer>

Marginalia

External Specifications:

<marginalia> is a PPEA extension element.

Note:

For a detailed discussion of the distinctions between marginalia, formework and corrector's marks, see the Important Distinctions section of the General Introduction.

The <marginalia> element is used to tag matter that is neither part of the original poetic text nor formework. This includes rubrics or glosses intended by the original scribe to be part of the original page, as well as annotations, rubrics, glosses, etc. supplied by later hands.

Attributes for the marginalia tag include the following:

"place" This attribute should always be supplied. For place, use only the following values:

inline
supralinear
sublinear
marginLeft
marginRight
topLeft
topCenter
topRight
bottomLeft
bottomCenter
bottomRight

"hand" This attribute should always be supplied. It identifies the scribe responsible for the marginalia.
"id" A unique identifier. We use these for hypertextual linkages involving marginalia, so they must always be present. Under normal circumstances, we will add them in Charlottesville after completion of the edition.
"type" Identifies the type of marginalia. No immediate plans for implementation.

If the marginal material is pictorial, the most common being a pointing hand, use <figDesc> tags within <figure> tags within <add> tags in a note:

<note type="codicological">A scribe has drawn a <add place="marginRight" hand="handx"><figure><figDesc>pointing hand</figDesc></figure></add> in the right margin.</note>

Note:

Notes pertaining to any feature of the text contained within marginalia tags must also be contained within the marginalia tags. Many, if not all, will be codicological.

Note:

Each <marginalia> tag must be placed directly above the <l> tag of the first line to which it is pertinent. In the case of marginalia pertaining to several lines of text, put the <marginalia> tag above the first line to which it pertains and supply a codicological note inside the <marginalia> tag explaining which additional lines it might be applicable to. In cases where two or more marginal comments are associated with a line, each must be recorded in its own marginalia element.
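The attribute rules given in this section ("place" restricted to a fixed list, "place" and "hand" always supplied) lend themselves to an automated check. The following Python sketch is one possible approach, not an Archive tool. The function name `check_marginalia` and the wording of its messages are invented for the example. It deliberately does not report a missing "id", since those are normally added after the edition is complete.

```python
import xml.etree.ElementTree as ET

# The only values permitted for the "place" attribute, as listed above.
ALLOWED_PLACES = {
    "inline", "supralinear", "sublinear", "marginLeft", "marginRight",
    "topLeft", "topCenter", "topRight",
    "bottomLeft", "bottomCenter", "bottomRight",
}


def check_marginalia(xml_text: str) -> list:
    """Return a list of attribute problems found on <marginalia> elements."""
    root = ET.fromstring(xml_text)
    problems = []
    for n, elem in enumerate(root.iter("marginalia"), start=1):
        # Fall back to a running count when no id has been assigned yet.
        label = elem.get("id") or f"marginalia #{n}"
        place = elem.get("place")
        if place is None:
            problems.append(f"{label}: missing place")
        elif place not in ALLOWED_PLACES:
            problems.append(f"{label}: place {place!r} is not an allowed value")
        if elem.get("hand") is None:
            problems.append(f"{label}: missing hand")
    return problems


# Hypothetical input: a misspelled place value and no hand attribute.
print(check_marginalia('<div><marginalia place="rightMargin">x</marginalia></div>'))
```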

Identifying Hands

External Specifications:

<handShift/>

For each manuscript, we will identify and describe as many contributing scribal hands as we can distinguish with confidence. A hand recognizably the same in two or more additions or changes to the text (whether by way of marginalia or corrections) should be given an identifying number, and that number will be identified in the document header.

The primary copyists in each manuscript are designated in order of appearance as hand1, hand2, etc.

A hand not thus characterized will be labeled "handx" in the SGML or XML tags (as neither SGML nor XML permits the use of a question mark, as in "hand?").

Beyond that, editors may use whatever designations they find useful. For example, if the editor is unable to recognize repeated instances of materials written by a hand, various hands may be lumped together, either simply as "handx" or identified by century or style. For instance, one might label one of several fifteenth-century hands as "hand15x" or secretary hands as "hand16saecx" or some other convenient designator.

In cases in which single instances of hands from entirely different, clearly identifiable eras appear, a designation such as hand19 and hand16 might be clearer than a simple handx referring to all such hands taken together. The addition of "x" to the hand designation is intended to make clear the ambiguity of the identification.

Note:

The identifier "handx" can be used as a simple placeholder pending later decisions.

Revision Dates

Revised on the following dates: 13 January 1994, 29 November 1995, 18 January 1997, 19 May 1997, 3 November 1997, 1 June 1998, 30 October 1998, 25 January 1999, 26 March 1999, 17 June 1999, 28 March 2001, 16 April 2001, 5 June 2001, 19 June 2003, 18 October 2004, 28 July 2005, 5 September 2005, 23 September 2005. Markup revisions: 23-25 May 2017.
