Centrum voor Teksteditie en Bronnenstudie
Centre for Scholarly Editing and Document Studies
a research centre of the Royal Academy of Dutch Language and Literature
7. Modifications to TEI element classes
In the course of test-tagging real correspondence material with a draft of the DALF DTD, the distribution of some TEI elements proved too narrow to cater for the encoding of the corresponding textual phenomena where they occur. On a documentary level, additions for example can occur before any other textual element in a letter, and closing formulae can contain notes or note markers. However, TEI elements like <add> and <note> are not allowed in those contexts.
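The two situations mentioned above can be sketched as follows. This is an illustrative fragment only, not taken from the DALF test material: it assumes a letter encoded in a TEI `<div>` with the standard `<opener>` and `<closer>` elements, and the text and the `type` value are invented for the example.

```xml
<div type="letter">
  <!-- an addition written above the letter, before any other textual element -->
  <add>Received 3 May, answered the same day.</add>
  <opener>
    <salute>Dear friend,</salute>
  </opener>
  <p>Thank you for your last letter.</p>
  <closer>
    <salute>Yours sincerely,</salute>
    <!-- a note attached directly to the closing formula -->
    <note>I shall write again from Ghent.</note>
    <signed>K.</signed>
  </closer>
</div>
```

Under the unmodified TEI classes, `<add>` as first child of the `<div>` and `<note>` as direct child of `<closer>` are the kind of placements that are rejected.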
Since we aim at a faithful representation of the documentary source with the DALF encoding scheme, we followed the suggestions made on various occasions in the TEI public mailing list [note6] (see http://listserv.brown.edu), namely to redefine some TEI elements as global elements. Technically, this solution is explained in great detail in Syd Bauman's message posted on 10 June 2002, and illustrated in the documentation of the changes regarding element class manipulations in the Women Writers Project (see http://www.tei-c.org/Applications/Samples/wwp-mods-classes.html).
The DALF encoding scheme redefines the following TEI elements as global elements: <add>, <figure>, <note>, and <seg>.
A first step in this process is to extend the "x-dot" entity for the Incl element class with these elements, as explained in section 3.7.2. Classes Used in Content Models and section 29.1.3. Class Extension of the TEI P4 Guidelines. Together with the other global DALF elements already discussed in these guidelines, the new DALF declaration for the Incl element class in the file DALFExtns.ent looks as follows:
<!ENTITY % x.Incl 'add | deco | figure | paraph | print | layerStart | layerEnd | note | seg |'>
<!ENTITY % m.Incl "%x.Incl; %n.anchor; | %m.editIncl; | %m.metadata; | %m.refsys;">
Contrary to the declaration of new DALF elements as members of Incl, the redefinition of already existing TEI elements can imply several other changes, such as the removal of those elements from the classes and elements where they are defined in the TEI DTD, in order to avoid the danger of an invalid DTD with lots of "ambiguous content models". The next sections discuss the motivations for the redefinition of the TEI elements mentioned, and the changes introduced by this process.
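A toy declaration may help to show why leaving an element in two places makes a content model ambiguous. The fragment below is a minimal sketch and is not taken from the TEI or DALF DTDs: the entity m.Incl.demo and the elements item.bad and item.good are hypothetical names chosen for illustration.

```dtd
<!-- Hypothetical class standing in for a global inclusion class -->
<!ENTITY % m.Incl.demo "add | note">

<!-- Ambiguous: after <title>, a <note> could match either the
     repeatable class group or the explicit trailing note, and a
     parser cannot decide between them without looking ahead -->
<!ELEMENT item.bad  (title, (%m.Incl.demo;)*, note?)>

<!-- Deterministic: the explicit reference to note is dropped;
     <note> is still permitted through the class -->
<!ELEMENT item.good (title, (%m.Incl.demo;)*)>
```

The same reasoning applies whenever an element is reachable both through Incl and through another class or an explicit mention in a content model, which is why such elements are removed from their original classes.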
Additions can occur anywhere on a letter, since additions originate from a "second-level" writing act, in which the "secondary" author does not necessarily consider the original letter as such. Additions are not necessarily written in "letter mode" and thus do not necessarily follow its structural conventions.
In the TEI DTD, <add> is defined as member of the edit element class. Consequently, the declaration of <add> as member of the more encompassing Incl element class in the DALF DTD mandates its deletion from edit. Therefore, the edit element class is redefined in the file DALFExtns.ent in the following way:
<!ENTITY % m.edit "app | corr | damage | del | orig | reg | restore | sic | space | supplied | unclear">
Figures can occur in "bare" form (i.e. outside other containing <body>-level elements) anywhere on a letter, since in some cases they originate from a "non-letter mode", e.g. when a distracted author/receiver makes marginal drawings that are (rather) unrelated to the contents and/or logical structure of the letter.
In the TEI DTD, <figure> is defined as member of the common[note7], inter, and tpParts element classes. The declaration of <figure> as member of the Incl element class in the DALF DTD means that the common, inter and tpParts element classes should be redefined without it. The following lines illustrate these changes made in the file DALFExtns.ent:
<!ENTITY % x.common 'calc |'>
<!ENTITY % m.common " calc | bibl | biblFull | biblStruct | ab | eTree | graph | l | lg | p | sp | tree | witList | cit | q | quote | label | list | listBibl | witDetail | stage | table">
<!ENTITY % x.inter 'calc |'>
<!ENTITY % m.inter "calc | bibl | biblFull | biblStruct | castList | cit | q | quote | label | list | listBibl | witDetail | stage | camera | caption | move | sound | tech | view | table | text">
<!ENTITY % x.tpParts 'calc |'>
<!ENTITY % m.tpParts "%x.tpParts; %n.byline; | %n.docAuthor; | %n.docDate; | %n.docEdition; | %n.docImprint; | %n.docTitle; | %n.epigraph; | %n.figure; | %n.imprimatur; | %n.titlePart;">
Authorial notes will only occur inside other text elements, yet in broader contexts than those defined by the TEI element class notes. For example, an author may make a note to the address line of an opening formula, or to the title of a work mentioned in the letter.
In the TEI DTD, <note> is defined as member of the biblPart, notes, terminologyInclusions, and dictionaryTopLevel element classes. Of these, only the first two are used in the DALF DTD. The transformation of <note> to a global element thus involved its removal from the classes biblPart and notes. Those element classes are defined in the file DALFExtns.ent as follows:
<!ENTITY % m.biblPart "analytic | author | biblScope | edition | editor | extent | idno | imprint | monogr | pubPlace | publisher | respStmt | series">
<!ENTITY % m.notes "witDetail">
A global <note> element caused further difficulties with the content models of some other TEI elements. The original TEI declarations of the elements <biblStruct> and <monogr> explicitly mentioned <note> in their content models. With the declaration of <note> as member of the Incl element class those content models became ambiguous. Therefore, they were redefined in the file DALFExtns.dtd without the <note> element in the following way:
<!ELEMENT biblStruct %om.RO; ((%m.Incl;)*, (analytic, (%m.Incl;)*)?, ((monogr, (%m.Incl;)*), (series, (%m.Incl;)*)*)+, (idno, (%m.Incl;)*)*)>
<!ATTLIST biblStruct %a.global; %a.declarable;>
<!ELEMENT monogr %om.RO; (( (%m.Incl;)*, (( (author | editor | respStmt), (author | editor | respStmt | %m.Incl;)*, (title, (%m.Incl;)*)+, ( (editor | respStmt), (%m.Incl;)* )* ) |( (title, (%m.Incl;)*)+, ((author | editor | respStmt), (%m.Incl;)*)* )))?, ((meeting), (%m.Incl;)*)*, (edition, (editor | respStmt | %m.Incl;)*)*, imprint, (imprint | extent | biblScope | %m.Incl;)* )>
<!ATTLIST monogr %a.global;>
An encoding scheme for primary manuscript materials should be able to cater for the documentary transcription of phenomena as they occur. Of course, there may be as many different structures as there are letters one may wish to encode. Yet an encoding scheme that anticipated this by allowing every element in the broadest possible contexts would be minimally expressive: a scheme with too many global elements would offer little more structural insight than the letter itself. Therefore, some level of abstraction should be maintained. This rationale motivated the declaration of <seg> as a global element in the DALF DTD. Its multi-purpose semantics, namely "to mark any segments of the text of interest for processing", and its general content model make it a useful intermediary global container element that can wrap loose text or text elements in places where no text or other elements are allowed. In this way, the number of global elements is kept small, while the expressive capacity of the encoding scheme is adapted to the irregular structure of the documents it is designed to represent.
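As a rough illustration of this use of <seg>, the fragment below wraps a few loose words that stand between two paragraphs, a position where bare text is not allowed. It is a sketch under assumptions: the enclosing `<div>`, the text, and the `type` value "loose" are invented for the example and are not prescribed by the DALF guidelines.

```xml
<div type="letter">
  <p>I hope to see you in Ghent next week.</p>
  <!-- loose words jotted between the paragraphs, belonging to neither -->
  <seg type="loose">paid 2 fr.</seg>
  <p>Give my regards to your mother.</p>
</div>
```

Because <seg> has a general content model, other elements that are not themselves global could be placed inside it in the same position.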
In the TEI DTD, <seg> is defined as member of the seg element class. Consequently, the declaration of <seg> as member of the more encompassing Incl element class in the DALF DTD mandates its deletion from seg. The following lines illustrate this change made in the file DALFExtns.ent:
<!ENTITY % m.seg "c | cl | m | phr | s | w">
A global <seg> element caused further difficulties with the content models of another TEI element. The original TEI declaration of the element <sp> explicitly included <seg> in its content model. With the declaration of <seg> as member of the Incl element class that content model became ambiguous. Therefore, it was redefined in the file DALFExtns.dtd without the <seg> element in the following way:
<!ELEMENT sp %om.RO; ((%m.Incl;)*, (speaker, (%m.Incl;)*)?,((p | l | lg | ab | stage), (%m.Incl;)*)+)>
<!ATTLIST sp %a.global; who IDREFS #IMPLIED>