Digital Talking Book Expanded Document Type Definition Documentation for Version V110

Date: 2002-02-27

Author: Harvey Bingham

Introduction

This informative document expands upon the Digital Talking Book XML Document Type Definition Implementing the ANSI/NISO Digital Talking Book V1.1.0 Document. See [DTBOOKV110DTD].



1. Purpose

The Digital Talking Book Document Type Definition (DTD) provides the means to mark up the text of a document so that professional narration can be combined with navigation into that narration. It also facilitates the output of a document's content in a variety of accessible formats. The markup tags in the book convey its content and structure, and contain some metadata about the book content and its structure.

The Document Type Definition names and defines the allowable element types, their allowable content, and their attributes. Correct markup of the text of the book permits the textual material to be synchronized using SMIL [SMIL2.0] files with the professionally narrated version of that book. The synchronization can permit concurrent display of the text being narrated. The textual content can be searched in context to locate material desired for narration.

The source DTD from which this document has been generated is dtbook110.dtd. See [DTBOOKV110DTD].


1.1. Prior Related Work

The DAISY (Digital Audio-based Information SYstem) Consortium contributed substantially to the development of this DTD. This application of XML is the next generation after several DAISY versions of 2.X specifications, see [DAISY202].

The DAISY Statement of Principles for the Creation and Production of Accessible Books and Materials [DAISY-2-GUIDELINES] represents the minimum standard to be met by Libraries of the Blind and producers of alternative format materials.

The DAISY 2.X Navigation Control Center (NCC) provided for synchronizing document structure with narration.

The NCC evolved into an XML application called the "Navigation Control File for XML applications" (NCX). Its content is derived from the markup of documents tagged using the dtbook DTD. Richer structuring capability is one of the objectives of that DTD. The Synchronized Multimedia Integration Language [SMIL2.0] is used to provide synchronized narrations and text. The NCX provides navigation using the identified elements of documents tagged to this DTD.

The dtbook DTD includes many, but not all, of the element types found in both the [HTML401STRICT] and [XHTML11STRICT] strict DTDs. HTML authoring tools permit those additional element tags, and may ignore the additional tags that are dtbook-specific. The lowercase names from XHTML are used, rather than the uppercase names from HTML.


1.2. Evolution from HTML

Dtbook110 has 79 element types. It shares 47 element types with the HTML4.0 Strict DTD [HTML401STRICT] (as adjusted to use the lower-case names consonant with the XHTML Strict DTD [XHTML11STRICT]). It omits 30 element types from them, and has 32 unique element types.

Endtag markup is sometimes optional in HTML. It is required for use with xhtml and dtbook. Any XML application [XML12] requires endtags, or their abbreviated form for empty elements, such as "<br />". The benefit of including endtags is that the tagged document has dependable structure that can be validated against the dtbook dtd.
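
The end-tag requirement can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree parser as a stand-in for any conforming XML parser; the two sample fragments are invented for illustration and are not taken from a real book.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML-style markup: <p> and <br> are left open, which XML does not allow.
html_style = "<div><p>First line<br>second line</div>"

# XML-style markup: every element is closed; the empty element uses "<br />".
xml_style = "<div><p>First line<br />second line</p></div>"

print(is_well_formed(html_style))
print(is_well_formed(xml_style))
```

This checks well-formedness only. Validation against the dtbook DTD is a separate step that the standard-library parser does not perform.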

Some tools available for browsing HTML may be used with dtbook material, at the expense of their discarding or ignoring some specific tagging and attributes that are not part of HTML 4.0. A CSS-based stylesheet [CSS1] or [CSS2] that identifies the presentation expectations for the HTML and non-HTML tags, or a filter to map those tags onto suitable HTML tags can provide appropriate visual presentation.


2. Document Tagging Content

A Digital Talking Book document is an XML application. Therefore, it must begin with the XML processing instruction, followed by the DOCTYPE declaration.


2.1. XML Processing Instruction Content

The XML Processing Instruction identifies the version of XML, and the optional character set encoding for the document:

<?xml version="1.0" encoding="UTF-8" ?>


2.2. Character Set Encodings

The character set in which the document is encoded is identified by one of a number of strings. All XML applications are expected to be able to recognize the UNICODE/ISO/IEC 10646 encodings "UTF-8" and "UTF-16" [ISO10646].

Some alternative encodings to "UTF-8" (or "ISO-10646-UCS-2") or "UTF-16" (or "ISO-10646-UCS-4") may be used. These include "ISO-8859-1", "ISO-8859-2", ... "ISO-8859-9" for parts of ISO 8859. See [ISO8859]. Note that US-ASCII (i.e., with all characters over decimal 127, e.g. from 128 to 255, encoded as &#nnn;) is conformant with UTF-8 (and with ISO-8859-1, HTTP's default header encoding).

Also, the values "ISO-2022-JP", "Shift_JIS", or "EUC-JP" can be used for various Japanese encoded forms of JIS X-0208-1997. See [JIS].

The Unicode characters may be represented as their code points, using the form &#xHHHH; where HHHH is a hexadecimal value formed from the digits 0-9 and letters A-F. Any initial H with value "0" may be elided.
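
As a small illustration of numeric character references, the following sketch writes the same character (U+00E9, e with acute accent) in hexadecimal with leading zeros, in hexadecimal without them, and in decimal. It assumes Python's standard xml.etree.ElementTree parser; the <w> content is an invented sample.

```python
import xml.etree.ElementTree as ET

# The same character, U+00E9, written three ways:
# hexadecimal with leading zeros, hexadecimal without them, and decimal.
fragment = "<w>caf&#x00E9; caf&#xE9; caf&#233;</w>"

text = ET.fromstring(fragment).text
words = text.split()

# All three words should be identical once the references are resolved.
print(words)
print(len(set(words)) == 1)
```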


2.3. DOCTYPE Declaration

The document type declaration, the DOCTYPE, follows. It has several forms. The simpler form assumes that the proper version of the dtbook DTD is in the same directory as the dtbook file itself.

<!DOCTYPE dtbook SYSTEM
"dtbook110.dtd">

A more general form adds a PUBLIC URI, from which the DTD can be obtained should the local SYSTEM copy be missing:

<!DOCTYPE dtbook PUBLIC
"http://www.loc.gov/nls/z3986/v100/dtbook110.dtd"
"dtbook110.dtd">

That assumes the URI can be reached, which may not be true for portable dtbook players.

The still more general form recommended for XML applications [XML12] is:

<!DOCTYPE dtbook PUBLIC
"-//NISO//DTD dtbook v1.1.0//EN"
"dtbook110.dtd">

where the Formal Public Identifier (FPI) on the second line is converted to the URI where it may be publicly found:

http://www.loc.gov/nls/z3986/v100/dtbook110.dtd

The [OASIS-TR9401] Entity Management Catalog provides an indirect means to provide that mapping from FPI to the dtd.

That catalog is more generally useful to provide the mapping from any external entity names (such as modules) to URIs where they may be found.
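
The catalog lookup can be sketched as follows. This is only an illustration under stated assumptions: it handles a one-entry-per-line subset of the catalog syntax with PUBLIC entries only, the local path "dtds/dtbook110.dtd" is hypothetical, and the function names are invented here. Real catalogs allow other entry types and comments, which this sketch ignores.

```python
import shlex

# A one-line catalog: PUBLIC "formal public identifier" "system identifier".
# The local path below is a hypothetical location for the DTD.
CATALOG_TEXT = '''
PUBLIC "-//NISO//DTD dtbook v1.1.0//EN" "dtds/dtbook110.dtd"
'''


def load_public_entries(catalog_text):
    """Collect PUBLIC entries, mapping each public identifier to a system identifier."""
    mapping = {}
    for line in catalog_text.splitlines():
        fields = shlex.split(line)  # keeps each quoted identifier as one field
        if len(fields) == 3 and fields[0].upper() == "PUBLIC":
            mapping[fields[1]] = fields[2]
    return mapping


def resolve(public_id, system_id, mapping):
    """Prefer the catalog entry for the FPI; otherwise fall back to the system identifier."""
    return mapping.get(public_id, system_id)


catalog = load_public_entries(CATALOG_TEXT)
print(resolve("-//NISO//DTD dtbook v1.1.0//EN", "dtbook110.dtd", catalog))
```

A portable player that cannot reach the network could ship such a catalog alongside a local copy of the DTD.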

Note that the reference above is to a particular version of the DTD, distinguished by the "110" in the file name (and by "v1.1.0" in the Formal Public Identifier).
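
To show how the parts of the DOCTYPE declaration are seen by software, the sketch below reads them back with Python's standard xml.dom.minidom, which records the declaration without fetching the DTD. The document body is a placeholder that is well-formed but is not claimed to be valid against the dtbook DTD.

```python
from xml.dom import minidom

# Placeholder document using the FPI form of the DOCTYPE declaration.
DOCUMENT = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dtbook PUBLIC
"-//NISO//DTD dtbook v1.1.0//EN"
"dtbook110.dtd">
<dtbook><head><title>Sample</title></head><book></book></dtbook>
"""

doctype = minidom.parseString(DOCUMENT.encode("utf-8")).doctype

print(doctype.name)      # the root element name given in the DOCTYPE
print(doctype.publicId)  # the Formal Public Identifier
print(doctype.systemId)  # the system identifier (file name or URI)
```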


2.4. Digital Talking Book File MIME Type

A Digital Talking Book document is tagged to the dtbook XML application. Its MIME media-type is "text/xml". The tagged book filename should have suffix ".xml". See [RFC2045].


3. Element Types

The element types comprising the dtbook DTD provide the syntactic elements from which the logical structure and content of the book are identified. In this section are the following subsections:

3.1. Element Structural Groupings
3.2. Element Usage Descriptions in Alphabetic Order
3.3. Element Declaration Form
3.4. Attribute List Declaration Form
3.5. Parameter, General, and Character Entity Declaration Forms
3.6. Links to Details
3.7. Element Types and Their Attributes
3.8. Element Type Use in Other Content Models
3.9. Parameter, General, and Character Entity Declaration Forms

3.2. Element Usage Descriptions in Alphabetic Order

The element types in the dtbook DTD provide the means to distinguish among the semantic uses of the various kinds of book content.

The element names are hyperlinked to their corresponding detailed descriptions.

Further information about the element types that come from the [HTML401STRICT] application (possibly qualified) is available from that resource.

The other element types are added for dtbook.

Element Description
a contains an anchor, which is used to reference another location, within the same or another <dtbook>. [XHTML11STRICT]
abbr designates an abbreviation, a shortened form of a word. For example: Mr., approx., lbs., rec'd. Contrast with <acronym>. [XHTML11STRICT]
acronym marks a word formed from key letters (usually initials) of a group of words. For example: UNESCO, NATO, XML, US. Contrast with <abbr>. [XHTML11STRICT]
address contains a location at which a person or agency may be contacted. By use of <line> to contain content of the individual lines, the class attribute can be used to identify the content of that <line>. For example, class values might include: name, address, region (state, province, etc.), country, location code (such as zipcode, provincial code), phone, fax, email, etc. [XHTML11STRICT]
annoref marks a text segment that references an <annotation>. Each <annoref> is usually a word, phrase, or whole line that is part of the surrounding text (identified in the original print book by bolding, italics, etc.). It should not normally be allowed to be turned off in a DTB application.
annotation is a comment on or explanation of a portion of a printed book. It differs from <note> in that an <annotation> is usually set in the margin or on a facing page, often with no explicit reference to it inserted in the text. Any local reference to <annotation id="xxx"> is by <annoref idref="#xxx">.
author identifies the writer of a work other than this one. Contrast with <docauthor>, which identifies the author of this work. <author> typically occurs within <blockquote> or <cite>.
bdo is used in special cases where the automatic actions of the bi-directional algorithm would result in incorrect display. [XHTML11STRICT]
blockquote indicates a block of quoted content that is set off from the surrounding text by paragraph breaks. Compare with <q>, which marks short, inline quotations. [XHTML11STRICT]
bodymatter consists of the text proper of a book, as contrasted with preliminary material <frontmatter> or supplementary information in <rearmatter>.
book surrounds the actual content of the document, which is divided into <frontmatter>, <bodymatter>, and <rearmatter>. <head>, which contains metadata, precedes <book>.
br marks a forced line break. [XHTML11STRICT]
caption describes a <table> or <img>. If used with <table> it must follow immediately after the <table> start tag. If used with <img> or <imggroup> it is not so constrained. [XHTML11STRICT]
cite marks a reference (or citation) to another document. [XHTML11STRICT]
code designates a fragment of computer code. [XHTML11STRICT]
col elements define the alignment properties for cells in one or more columns. [XHTML11STRICT]
colgroup groups adjacent columns <col> that are semantically related. [XHTML11STRICT]
dd marks a definition of the preceding term <dt> within a definition list <dl>. A definition without a preceding <dt> has no semantic interpretation, but is visually presented aligned with other <dd>. [XHTML11STRICT]
dfn marks the first occurrence of a word or term that is defined or explained there or elsewhere in <book>. Often <dfn> is rendered in italics, sometimes in parentheses. [XHTML11STRICT]
div is a generic container for subdivisions of a book. The <level1> ... <level6> hierarchy, or the <level> tag used recursively, should mark the major hierarchical structures of a book, while <div> is used in less formal circumstances or when for production purposes it is desired that a structure should be treated differently. Compare with <span>, which is used in inline settings. [XHTML11STRICT]
dl contains a definition list, usually consisting of pairs of terms <dt> and definitions <dd>. Any definition can contain another definition list. [XHTML11STRICT]
docauthor marks each author or editor of this work. Compare with <author>, used to mark the author of another work, within <blockquote> or <cite>.
doctitle marks the title of the book within <frontmatter>. By convention <doctitle> should appear only once. Contrast with <title>, which occurs as metadata in <head> and whose content is generally the same.
dt marks a term in a definition list <dl> for which a definition <dd> follows. [XHTML11STRICT]
dtbook is the root element in the Digital Talking Book DTD. <dtbook> contains metadata in <head> and the contents itself in <book>.
em indicates emphasis. Usually <em> is rendered in italics. Compare with <strong>. [XHTML11STRICT]
frontmatter usually contains <doctitle> and <docauthor>, as well as preliminary material that is often enclosed in appropriate <level> or <level1>. Content may include copyright notice, foreword, acknowledgments, table of contents, etc. <frontmatter> serves as a guide to the content and nature of a <book>.
h1 contains the text of the heading for a <level1> structure. [XHTML11STRICT] but nested
h2 contains the text of the heading for a <level2> structure. [XHTML11STRICT] but nested
h3 contains the text of the heading for a <level3> structure. [XHTML11STRICT] but nested
h4 contains the text of the heading for a <level4> structure. [XHTML11STRICT] but nested
h5 contains the text of the heading for a <level5> structure. [XHTML11STRICT] but nested
h6 contains the text of the heading for a <level6> structure. [XHTML11STRICT] but nested
hd marks the text of a heading in a <list> or <sidebar>.
head contains metainformation about the book but no actual content of the book itself, which is placed in <book>. This information is consonant with the <head> information in xhtml, see [XHTML11STRICT]. Other miscellaneous elements can occur before and after the required <title>. By convention <title> should occur first. [XHTML11STRICT]
hr is an empty element, minimally <hr />, indicating a horizontal rule. It may be used to indicate a break in the text where only blank lines, a row of asterisks, a horizontal line, etc. are used in the print book. [XHTML11STRICT]
img marks a visual image. An <img> will always contain an alt and generally contain a longdesc, a pointer to a related <prodnote>. The <img> may be referenced by a <caption> or <prodnote>, using, for example, the form <caption imgref="#yyy">the Caption</caption> for the <img id="yyy">. [XHTML11STRICT]
imggroup provides a container for one or more <img> and associated <caption>(s) and <prodnote>(s). A <prodnote> may contain a description of the image. The content model allows: 1) multiple <img> if they share a caption, with the ids of each <img> in the <caption imgref="id1 id2 ...">, 2) multiple <caption> if several captions refer to a single <img id="xxx"> where each caption has the same <caption imgref="xxx">, 3) multiple <prodnote> if different versions are needed for different media (e.g., large print, braille, or print). If several <prodnote> refer to a single <img id="xxx">, each prodnote has the same <prodnote imgref="xxx">.
kbd designates information that the reader is to input directly into a computer using the keyboard. [XHTML11STRICT]
level is an alternative tag for marking the major structures in a book. It may be used recursively, i.e., repeated indefinitely with each successive occurrence nesting within the previous. It may also be included in a subsequent higher level. Subordinate levels have greater depth. Contrast with the explicit <level1>...<level6> elements, which may not be intermixed with <level>.
level1 is the highest-level container of major divisions of a book. Used in <frontmatter>, <bodymatter>, and <rearmatter> to mark the largest divisions of the book (usually parts or chapters), inside which level2 subdivisions (often sections) may nest. The class attribute identifies the actual name (e.g., part, chapter) of the structure it marks. Contrast with <level>.
level2 contains subdivisions that nest within <level1> divisions. The class attribute identifies the actual name (e.g., subpart, chapter, subsection) of the structure it marks.
level3 contains sub-subdivisions that nest within <level2> subdivisions (e.g., sub-subsections within subsections). The class attribute identifies the actual name (e.g., section, subpart, subsubsection) of the subordinate structure it marks.
level4 contains further subdivisions that nest within <level3> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks.
level5 contains further subdivisions that nest within <level4> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks.
level6 contains further subdivisions that nest within <level5> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks.
levelhd contains the text of a heading within <level>. Corresponds to <h1> through <h6> used in <level1> through <level6>.
li marks each list item in a <list>. <li> content may be either inline or block and may include other nested lists. Alternatively it may contain a sequence of list item components, <lic>, that identify regularly occurring content, such as the heading and page number of each entry in a table of contents. [XHTML11STRICT]
lic ("list item component") allows ordered substructure within a list item <li>. Used when a list item is made up of two or more components, as in a table of contents entry. The same number of <lic> should occur in each <li>. If not, correspondence of <lic> in different <li> is in order of occurrence for the current writing direction of the <li>.
line marks a single logical line of text. Often used in conjunction with <linenum> in documents with numbered lines.
linenum contains a line number, for example in legal text.
link is an empty element appearing in the <head> section of a document that establishes a connection between the current document and another document. The <link> element conveys relationship information (for example, "next" and "previous") that may be rendered by user agents in a variety of ways. [XHTML11STRICT]
list contains some form of list, ordered or unordered. The list may have intermixed heading <hd> (generally only one, possibly with <prodnote>) and an intermixture of list items <li> and <pagenum>. If bullets and outline enumerations are part of the print content, they are expected to prefix those list items in content, rather than be implicitly generated. Note: XHTML has explicitly distinguished list element types: ol for ordered, and ul for unordered.
meta indicates metadata about the book. It is an empty element that may appear repeatedly only in <head>. [XHTML11STRICT]
note marks a footnote, endnote, etc. Any local reference to <note id="yyy"> is by <noteref idref="#yyy">.
noteref marks one or more characters that reference a footnote or endnote <note>. Contrast with <annoref>. <noteref> and <note> are independently skippable.
notice contains a warning, caution, or other type of admonition normally found in the margin of a book. In contrast with <sidebar> a <notice> must be presented at a specific location within the text. Its presentation is not optional.
p contains a paragraph, which may contain subsidiary <list> or <dl>. [XHTML11STRICT]
pagenum contains one page number as it appears from the print document, usually inserted at the point within the file immediately preceding the first item of content on a new page.
prodnote contains language added to the alternative-format version by the producer; commonly used to: 1) provide descriptions of one or more visual elements such as charts, graphs, etc. 2) supply operating instructions 3) describe differences between the print book and the audio version.
q contains a short, inline quotation. Compare with <blockquote>, which marks a longer quotation set off from the surrounding text. [XHTML11STRICT]
rearmatter contains supplementary material such as appendices, glossaries, bibliographies, and indices. It follows the <bodymatter> of the book.
samp contains a sample of work created by the author for use as an example or template. For example, a sample business letter, resume, computer program output, or form. [XHTML11STRICT]
sent marks a sentence.
sidebar contains information supplementary to the main text and/or narrative flow and is often boxed and printed apart from the main text block on a page. It may have a heading <hd>.
span is a generic container for use in inline settings when no specific tag exists for a given situation. The class attribute may describe the nature of the text it marks (e.g., a typographical error). May be used to mark a class of items to which styles are to be applied. Compare with <div>, which is used in block settings. [XHTML11STRICT]
strong marks stronger emphasis than <em>. Visually <strong> is usually rendered bold. [XHTML11STRICT]
style provides the means to include styling information that applies to the book. It may appear only in <head>. It may include CDATA sections. [XHTML11STRICT]
sub indicates a subscript character (printed below a character's normal baseline). Can be used recursively and/or intermixed with <sup>. [XHTML11STRICT]
sup marks a superscript character (printed above a character's normal baseline). Can be used recursively and/or intermixed with <sub>. [XHTML11STRICT]
table contains cells of tabular data arranged in rows and columns. A <table> may have a <caption>. It may have descriptions of the columns in <col>s or groupings of several <col> in <colgroup>. A simple <table> may be made up of just rows <tr>. A long table crossing several pages of the print book should have separate <pagenum> values for each of the pages containing that <table> indicated on the page where it starts. Note the logical order of optional <thead>, optional <tfoot>, then one or more of either <tbody> or just rows <tr>. This order accommodates simple or large, complex tables. The <thead> and <tfoot> information usually helps identify content of the <tbody> rows. For a multiple-page print <table> the <thead> and <tfoot> are repeated on each page, but not redundantly tagged. [XHTML11STRICT]
tbody marks a group of rows in the main body of a <table>. If the <table> is divided into several sections, each consisting of a number of rows, each section would be separately tagged with <tbody>. The same <thead> and <tfoot> apply to every <tbody> section. Use multiple <tbody> sections when rules are needed between groups of table rows. [XHTML11STRICT]
td indicates a table cell containing data. [XHTML11STRICT]
tfoot marks footer information in a <table>, consisting of one or more rows <tr>, usually of <th> cells. Use <tfoot> to duplicate footers when breaking table across page boundaries, or for static footers when <tbody> sections are rendered in scrolling panel. [XHTML11STRICT]
th indicates a table cell containing header information. [XHTML11STRICT]
thead marks header information in a <table>, consisting of one or more rows <tr> of <th> cells. Use <thead> to duplicate headers when breaking table across page boundaries, or for static headers when <tbody> sections are rendered in scrolling panel. [XHTML11STRICT]
title contains the title of the book but is used only as metainformation in <head>. Use <doctitle> within <book> for the actual book title, which will usually be the same. [XHTML11STRICT]
tr marks one row of a <table> containing <th> or <td> cells. [XHTML11STRICT]
w marks a word.
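
How several of these element types fit together can be sketched by building a skeleton document. The sketch assumes Python's standard xml.etree.ElementTree; the nesting follows the descriptions in the table above, but it is not checked against the DTD content models, and all text values are placeholders.

```python
import xml.etree.ElementTree as ET

# Root element: metadata in <head>, content in <book>.
dtbook = ET.Element("dtbook", {"version": "1.1.0"})

head = ET.SubElement(dtbook, "head")
ET.SubElement(head, "title").text = "A Sample Book"

book = ET.SubElement(dtbook, "book")

# Preliminary material: the title and author of this work.
frontmatter = ET.SubElement(book, "frontmatter")
ET.SubElement(frontmatter, "doctitle").text = "A Sample Book"
ET.SubElement(frontmatter, "docauthor").text = "A. N. Author"

# Text proper: one chapter marked as <level1> with its <h1> heading.
bodymatter = ET.SubElement(book, "bodymatter")
chapter = ET.SubElement(bodymatter, "level1", {"class": "chapter"})
ET.SubElement(chapter, "h1").text = "Chapter One"
ET.SubElement(chapter, "p").text = "The first paragraph of the chapter."

print(ET.tostring(dtbook, encoding="unicode"))
```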


3.3. Element Declaration Form

For each of the element declarations the following information is supplied:

  1. <!ELEMENT elementname>
  2. Description. Repeats the description shown in Section 3.2. If the element is from HTML 4.0 strict, a bracketed suffix such as [XHTML11STRICT] or [XHTML11] appears.
  3. Contains: original element content model.
  4. Expanded: shown only if the original content model contains parameter entity references, which are then fully expanded.
  5. Occurs within: list of element names in which this element may (or must) occur directly, in their content models.
  6. Attributes: Original: attribute list
  7. Expanded: shown only if the original attribute list contains parameter entity references.
  8. Explanation of any unique attributes.

Element Declarations are shown as:

<!ELEMENT elementname
Contains:
EMPTY or original content model:
Expanded:
content model with parameter entities expanded
>

EMPTY declared content denotes that the element has no explicit content. Instead, its purpose is to mark a position, and to associate attribute values with that position. The XML-approved way to indicate the end of such an element is the special tag close ' />'. Some browsers require that leading space (after the end of the final quoted value for an attribute).

For example, the XML-preferred way using the EMPTY horizontal rule tag:
<hr title="horizontal rule purpose" />
or for backward compatibility the explicit end tag form may be used:
<hr title="horizontal rule purpose"></hr>
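
That the two forms are equivalent to an XML parser can be checked with a short sketch, assuming Python's standard xml.etree.ElementTree. The helper name `summary` is invented for this illustration.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<hr title="horizontal rule purpose" />')
long_form = ET.fromstring('<hr title="horizontal rule purpose"></hr>')


def summary(element):
    """Reduce an element to its name, attributes, text, and number of children."""
    return (element.tag, dict(element.attrib), element.text, len(element))


# Both forms yield the same element: no text and no children.
print(summary(short_form) == summary(long_form))

# When written back out, this serializer uses the short form with the leading space.
print(ET.tostring(long_form, encoding="unicode"))
```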

Element content models are formed from a parenthesized list of names of other elements or parameter entities of the form %name; or #PCDATA, separated by connectors:

Connector   Use           Example
,           sequence      (x,y) is x followed by y
|           alternative   (x|y) is either x or y
(...)       grouping      (x,(y|z)) is x followed by either y or z

Individual names or groupings of them may have a following replicator. If there is no replicator, that means just one of what precedes it, possibly a parenthesized expression.

Replicator   Use                       Example
?            optional                  x? means zero or one of x
*            optional and repeatable   x* means zero or more of x
+            repeatable                x+ means one or more of x
none         one only                  x means just one x

#PCDATA stands for parsed character data. The name suggests that the content may include other inline element tags intermixed as allowed in the content model.

#PCDATA can occur as an entire content model. Or, it can occur first among alternatives. For example:

Content Model        Meaning
(x | y)+             Either x or y, repeatable, in any order
(x,y?)               x, optionally followed by y
(x?,(y|z)*)          Optional x, followed by an optional and repeatable choice of y or z
(#PCDATA)            Any text or character entities
(#PCDATA | x | y)*   #PCDATA, intermixed and repeatable, with x or y

The final example above with #PCDATA allows choice among 0 or more of x or y intermixed with text. XML requires the "*" replicator on such mixed content models. Note that because the #PCDATA is present, even an empty string matches this model.
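
The connectors and replicators behave much like regular-expression operators, which suggests the following sketch for checking a sequence of child element names against an element-only content model. It is an illustration rather than a validator: the function names are invented here, and #PCDATA, EMPTY, and parameter entity references are deliberately not handled.

```python
import re

NAME = re.compile(r"[A-Za-z_][\w.\-]*")


def model_to_regex(model):
    """Translate an element-only content model into a regex over child names."""
    # Each element name becomes a group matching that name plus a ';' terminator.
    pattern = NAME.sub(lambda m: "(?:" + re.escape(m.group(0)) + ";)", model)
    # ',' (sequence) is plain concatenation in a regex; whitespace is insignificant.
    pattern = re.sub(r"[,\s]+", "", pattern)
    # '|', '?', '*', '+', and parentheses carry over with the same meaning.
    return re.compile(pattern)


def children_match(model, children):
    """Return True if the list of child element names satisfies the model."""
    sequence = "".join(name + ";" for name in children)
    return model_to_regex(model).fullmatch(sequence) is not None


print(children_match("(x?,(y|z)*)", ["x", "y", "z", "y"]))
print(children_match("(x?,(y|z)*)", ["y", "x"]))
print(children_match("(x,y?)", ["x"]))
print(children_match("(x | y)+", []))
```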

#PCDATA may also contain character entities that permit representing non-ASCII characters, intermixed with characters from the document character set. If this set is "ISO-8859-1" or the ASCII characters, a form for referencing such non-ASCII characters is:

&#xHHHH;

where the HHHH denotes a Unicode hexadecimal code position.
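
Going in the other direction, text can be reduced to ASCII by writing each non-ASCII character as a hexadecimal reference. The sketch below assumes Python and its standard xml.etree.ElementTree; the function name is invented for this illustration, and the function does not escape markup characters such as "&" or "<".

```python
import xml.etree.ElementTree as ET


def to_ascii_with_references(text):
    """Replace every non-ASCII character with a hexadecimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


original = "na\u00efve caf\u00e9"
encoded = to_ascii_with_references(original)
print(encoded)

# A parser resolves the references back to the original characters.
restored = ET.fromstring(f"<p>{encoded}</p>").text
print(restored == original)
```

Python's built-in `text.encode("ascii", "xmlcharrefreplace")` produces the equivalent decimal references.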


3.4. Attribute List Declaration Form

An Attribute list has the form:

<!ATTLIST associatedElementName attributeList>

The associatedElementName identifies the corresponding element type.

The attributeList has one or more attributes. Each attribute has three parts:

1. Name of attribute as it may appear in a document tag; there quoted in one of the forms:

name="value"
name='value'

Note that the value may include character entities such as the quoting entities themselves: quote &quot; (") and apostrophe: &apos; (').

2. Declared value of attribute, of various types:

Attribute Kind      Explanation
ID                  identifier, formed from letters (case sensitive), digits, dash, underscore, and period.
IDREF               Value is one ID value
IDREFS              Values are one or more space-separated ID values
CDATA               character string, with the semantic meaning suggested by the parameter entity name.
(name1|name2|...)   Select at most one among the alternatives

3. Default value of attribute, of various kinds:

Default Value   Explanation
#IMPLIED        attribute and its value may be omitted, and if so, the meaning is up to the system.
#REQUIRED       attribute and its value must be included.
quoted value    One of the explicit names in the Declared value alternatives, using either quoting form: "value", or 'value'.
#FIXED          Unchanging explicit value shown.

The use of each unique attribute is explained.
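
The effect of the declared values and default values can be sketched as a small checker. The data layout and the function name are assumptions made for this illustration. The two declarations are modeled on the version and dir attributes shown for <dtbook> in Section 3.7.

```python
def apply_attlist(declarations, supplied):
    """Check supplied attributes against declarations and fill in defaults.

    declarations maps an attribute name to (allowed, default), where allowed is
    None for CDATA or a tuple of enumerated names, and default is "#IMPLIED",
    "#REQUIRED", ("#FIXED", value), or a plain default value.
    """
    for name in supplied:
        if name not in declarations:
            raise ValueError(f"undeclared attribute: {name}")
    result = dict(supplied)
    for name, (allowed, default) in declarations.items():
        if name in result:
            value = result[name]
            if allowed is not None and value not in allowed:
                raise ValueError(f"{name}: {value!r} is not one of {allowed}")
            if isinstance(default, tuple) and value != default[1]:
                raise ValueError(f"{name}: fixed value is {default[1]!r}")
        elif default == "#REQUIRED":
            raise ValueError(f"{name}: required attribute is missing")
        elif isinstance(default, tuple):
            result[name] = default[1]  # a #FIXED value is supplied automatically
        elif default != "#IMPLIED":
            result[name] = default     # an explicit default value is supplied
    return result


DTBOOK_ATTRS = {
    "version": (None, ("#FIXED", "1.1.0")),
    "dir": (("ltr", "rtl"), "#IMPLIED"),
}

print(apply_attlist(DTBOOK_ATTRS, {"dir": "rtl"}))
```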


3.5. Parameter, General, and Character Entity Declaration Forms

Three kinds of entity declarations may appear:

Parameter entities
appear in the DTD and possibly in the document internal subset.
General entities
may appear in the DTD or internal subset or within a document.
Character entities
may appear in the DTD or internal subset or within a document. Each content identifies a corresponding single character.

3.5.1. Parameter Entities

Parameter entities are used to define content in one place for reuse. The parameter entity form is:

<!ENTITY % pename "...">

The effect is to define pename to have the value "...".

Reference to that parameter entity is by:

%pename;

A parameter entity definition may contain other parameter entity references. See Section 3.9 Parameter Entity Declarations for the parameter entities defined in this dtbook110 DTD.
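
Parameter entity expansion, including nested references, can be sketched as a recursive text substitution. The values of headmisc and dtbookcontent below are inferred from the expanded content models shown in Section 3.7. The entity headcontent is hypothetical, added only to show nesting, and the function name is invented for this illustration.

```python
import re

PE_REFERENCE = re.compile(r"%([A-Za-z_][\w.\-]*);")

ENTITIES = {
    "headmisc": "style | meta | link",
    "dtbookcontent": "head,book",
    # Hypothetical entity whose value itself refers to another parameter entity.
    "headcontent": "(%headmisc;)*, title, (%headmisc;)*",
}


def expand(text, entities, depth=0):
    """Replace every %name; reference with its value, expanding nested references."""
    if depth > 50:
        raise ValueError("parameter entity references nest too deeply (possible cycle)")
    return PE_REFERENCE.sub(
        lambda m: expand(entities[m.group(1)], entities, depth + 1), text
    )


print(expand("(%dtbookcontent;)", ENTITIES))
print(expand("((%headmisc;)*, title, (%headmisc;)*)", ENTITIES))
print(expand("(%headcontent;)", ENTITIES))
```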


3.5.2. General Entities

A general entity declaration has the form:
<!ENTITY gename "...">
A general entity is referenced using the prefix:
"&"
and suffix
";".
For example,
"&gename;"

Examples in the DTD (and every XML application DTD) are the special character entities.


3.5.3. Character Entities

The five following characters may have special markup meaning, so are expressed as character entities in text. They are recognizable since they are preceded by "&" and followed by ";".

The notation below, #xHHHH (or #xHH), where H is a hexadecimal digit (formed from 0-9 and A-F), indicates the character code position in Unicode/ISO-10646 [ISO10646].

Entity Declaration (Hex Value)      ASCII Value (Decimal)   Character   Description
<!ENTITY lt "&#x0026;#x003C;">      "&#38;#60;"             <           Less than, normally starts a tag.
<!ENTITY gt "&#x003E;">             "&#62;"                 >           Greater than, normally ends a tag.
<!ENTITY amp "&#x0026;#x0026;">     "&#38;#38;"             &           Ampersand, normally begins a character entity reference.
<!ENTITY apos "&#x0027;">           "&#39;"                 '           Neutral Quote, Apostrophe, if needed within an attribute string so quoted.
<!ENTITY quot "&#x0022;">           "&#34;"                 "           Quotation mark, if needed within an attribute string so quoted.

Note that the "<" and "&" characters in the declarations of "lt" and "amp" above are doubly escaped to meet the requirement that entity replacement be well-formed.

As these character entities occur in the first plane of Unicode, with encodings the same as ASCII, the "00" prefix can be implied, so may be omitted.
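
In practice these entities are usually inserted by an escaping routine rather than by hand. The sketch below uses the escape, unescape, and quoteattr helpers from Python's standard xml.sax.saxutils module; the sample strings are invented for illustration.

```python
from xml.sax.saxutils import escape, quoteattr, unescape

text = 'Fish & chips < "caviar"'

# escape() replaces "&", "<", and ">" in element content.
escaped = escape(text)
print(escaped)

# unescape() reverses the replacement.
print(unescape(escaped) == text)

# quoteattr() also chooses a quoting form for an attribute value, and falls
# back to &quot; only when the value contains both kinds of quote.
print(quoteattr(text))
print(quoteattr("it's \"quoted\""))
```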

Three larger character sets included in [HTML 4.0] are omitted here:

HTMLlat1.ent
HTMLsymbol.ent
HTMLspecial.ent

Unicode [ISO10646] is available to XML applications, so these characters are available without the need for those entity sets.

The initial processing instruction that identifies dtbook as an XML application should use a more inclusive encoding, as described at the start of section 2.


3.6. Links to Details

Links are provided to the detailed descriptions for each elementname and parameter entity name in the detailed descriptions of the elements, their attlists, and the parameter entities that follow. When any parameter entity occurs in a declaration, a fully-expanded version of that declaration also appears.

The links to parameter entities in section 3.9 Parameter Entities Sorted are principally useful to get more detailed information about them.


3.7. Element Types and Their Attributes

The details for each of the 79 elements appear below, ordered top-down where sequential, or otherwise by approximate document structural level:

  • name
  • usage
  • original (and parameter entity expanded) content models
  • original (and expanded) attribute list
  • usage for the special attributes in the attribute list
  • other element content models that can be direct parent elements, sorted and listed.

Document Structure


dtbook

dtbook is the root element in the Digital Talking Book DTD. <dtbook> contains metadata in <head> and the contents itself in <book>.
<!ELEMENT dtbook
Contains:
(%dtbookcontent;)
Expanded:
(head,book)
>

May not occur in other element content models, as dtbook is the root element.

<!ATTLIST dtbook
Attribute Declared Value Default Value
Original:
version CDATA #FIXED '1.1.0'
%i18n;
Expanded:
version CDATA #FIXED '1.1.0'
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED


"version" is fixed at '1.1.0'. It need not be specified, but if present it must have that value. It identifies the specific version of the DTD, so that the DTD version for any dtbook can be recognized.

The "%i18n;" internationalization attributes characterize the <book>. Their values may be overridden where the language changes within it.
>
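
A minimal sketch of a document that follows the dtbook content model (head, book) is shown below. The system identifier assumes that dtbook110.dtd sits beside the document, and the title text and language value are placeholders. The empty <book> is permitted because all three of its children are optional.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: the DTD file is in the same directory as this document. -->
<!DOCTYPE dtbook SYSTEM "dtbook110.dtd">
<dtbook version="1.1.0" xml:lang="en">
  <head>
    <!-- title is the only required child of head -->
    <title>Sample Book</title>
  </head>
  <!-- frontmatter, bodymatter, and rearmatter are each optional -->
  <book/>
</dtbook>
```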

Go to the Element Structural Groupings.


Document Head Metadata


head

head contains metainformation about the book but no actual content of the book itself, which is placed in <book>. This information is consistent with the <head> information in XHTML, see [XHTML11STRICT]. Other miscellaneous elements can occur before and after the required <title>. By convention <title> should occur first. [XHTML11]
<!ELEMENT head
Contains:
((%headmisc;)*, title, (%headmisc;)*)
Expanded:
((style | meta | link)*,title,(style | meta | link)*)
>

May occur within the element content model:

dtbook
<!ATTLIST head
Attribute Declared Value Default Value
Original:
%i18n;
profile %URI; #IMPLIED
Expanded:
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
profile CDATA #IMPLIED


"profile" gives one or more whitespace-separated profile URI targets that may provide additional information about the current document.
>

title

title contains the title of the book but is used only as metainformation in <head>. Use <doctitle> within <book> for the actual book title, which will usually be the same. [XHTML11]
<!ELEMENT title
Contains:
(#PCDATA)
>

May occur within the element content model:

head
<!ATTLIST title
Attribute Declared Value Default Value
Original:
%i18n;
Expanded:
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
>


meta

meta indicates metadata about the book. It is an empty element that may appear repeatedly only in <head>. [XHTML11]
<!ELEMENT meta
Contains:
EMPTY
>

May occur within the element content model:

head
<!ATTLIST meta
Attribute Declared Value Default Value
Original:
%i18n;
http-equiv NMTOKEN #IMPLIED
name NMTOKEN #IMPLIED
content CDATA #REQUIRED
scheme CDATA #IMPLIED
Expanded:
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
http-equiv NMTOKEN #IMPLIED
name NMTOKEN #IMPLIED
content CDATA #REQUIRED
scheme CDATA #IMPLIED


"http-equiv" connects the content attribute value to an http header field.

"name" value identifies the specific kind of content value.

"content" indicates the value for that "name", possibly constrained by the semantics for the individual names.

"scheme" indicates a predetermined format for interpreting the content value, such as the Dublin Core.
>
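
The following fragment sketches how the meta attributes described above might be combined. The name, content, and scheme values are illustrative assumptions, not values prescribed by this DTD.

```xml
<head>
  <title>Sample Book</title>
  <!-- name identifies the kind of value; content carries the value -->
  <meta name="dc:Title" content="Sample Book"/>
  <!-- scheme names a format for interpreting content (hypothetical values) -->
  <meta name="identifier" scheme="ISBN" content="0-0000-0000-0"/>
  <!-- http-equiv ties the content value to an HTTP header field -->
  <meta http-equiv="Content-Language" content="en"/>
</head>
```

Only "content" is required. A meta element would normally carry either "name" or "http-equiv" so that the value can be interpreted.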

style

style provides the means to include styling information that applies to the book. It may appear only in <head>. It may include CDATA sections. [XHTML11]
<!ELEMENT style
Contains:
(#PCDATA)
>

May occur within the element content model:

head
<!ATTLIST style
Attribute Declared Value Default Value
Original:
%i18n;
type %ContentType; #REQUIRED
media %MediaDesc; #IMPLIED
title %Text; #IMPLIED
xml:space (default | preserve) 'preserve'
Expanded:
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
type CDATA #REQUIRED
media CDATA #IMPLIED
title CDATA #IMPLIED
xml:space (default | preserve) 'preserve'


"type" indicates the MIME-Type [RFC2045]. Type value should be 'text/css', rather than 'text/javascript'.

"media" value indicates the media for stylesheet definition(s); if multiple, separated by commas.

"title" can provide menu choice among alternative stylesheets.

"xml:space" defaults to 'preserve', which indicates that whitespace in the <style> content is preserved, without the need to specify the attribute on each <style>. (xml:space='default' accepts system style adjustment, such as adding its own indenting.)
>
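
As an illustration, a <style> element with the required "type" attribute and a CDATA section might look like the sketch below. The stylesheet rules and the media and title values are assumptions chosen only to show the syntax.

```xml
<head>
  <title>Sample Book</title>
  <!-- type is required; media lists are comma-separated -->
  <style type="text/css" media="screen, print" title="Default">
    <![CDATA[
      h1 { font-weight: bold; }
      p  { margin-left: 1em; }
    ]]>
  </style>
</head>
```

Because "xml:space" defaults to 'preserve', the line breaks and indentation inside the stylesheet are kept as written.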

Go to the Element Structural Groupings.


Book Content


book

book surrounds the actual content of the document, which is divided into <frontmatter>, <bodymatter>, and <rearmatter>. <head>, which contains metadata, precedes <book>.
<!ELEMENT book
Contains:
(frontmatter?, bodymatter?, rearmatter?)
>

May occur within the element content model:

dtbook
<!ATTLIST book
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

Go to the Element Structural Groupings.


Book Major Structures


frontmatter

frontmatter usually contains <doctitle> and <docauthor>, as well as preliminary material that is often enclosed in appropriate <level> or <level1>. Content may include copyright notice, foreword, acknowledgments, table of contents, etc. <frontmatter> serves as a guide to the content and nature of a <book>.
<!ELEMENT frontmatter
Contains:
(doctitle | docauthor | level | level1 | %block;)+
Expanded:
(doctitle | docauthor | level | level1 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation)+
>

May occur within the element content model:

book
<!ATTLIST frontmatter
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

bodymatter

bodymatter consists of the text proper of a book, as contrasted with preliminary material <frontmatter> or supplementary information in <rearmatter>.
<!ELEMENT bodymatter
Contains:
(level | level1 | %block;)+
Expanded:
(level | level1 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation)+
>

May occur within the element content model:

book
<!ATTLIST bodymatter
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

rearmatter

rearmatter contains supplementary material such as appendices, glossaries, bibliographies, and indices. It follows the <bodymatter> of the book.
<!ELEMENT rearmatter
Contains:
(level | level1 | %block;)+
Expanded:
(level | level1 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation)+
>

May occur within the element content model:

book
<!ATTLIST rearmatter
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>
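
The three major structures can be sketched together as follows. The text content and class values are placeholders, and the example assumes that <doctitle>, <docauthor>, <h1>, and <p> accept plain character data, as their descriptions elsewhere in this document suggest.

```xml
<book>
  <frontmatter>
    <doctitle>Sample Book</doctitle>
    <docauthor>A. N. Author</docauthor>
  </frontmatter>
  <bodymatter>
    <level1 class="chapter">
      <h1>Chapter One</h1>
      <p>Text of the first chapter.</p>
    </level1>
  </bodymatter>
  <rearmatter>
    <level1 class="appendix">
      <h1>Appendix</h1>
      <p>Supplementary material.</p>
    </level1>
  </rearmatter>
</book>
```

Each of the three containers, when present, must hold at least one child, since their content models end in "+".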

Go to the Element Structural Groupings.


dtbook Recursive Structure level


level

level is an alternative tag for marking the major structures in a book. It may be used recursively, i.e., repeated indefinitely with each successive occurrence nesting within the previous. It may also be included in a subsequent higher level. Subordinate levels have greater depth. Contrast with the explicit <level1>...<level6> elements, which may not be intermixed with <level>.
<!ELEMENT level
Contains:
(levelhd | %block; | %inlineinblock; | level | doctitle | docauthor)+
Expanded:
(levelhd | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum | level | doctitle | docauthor)+
>

May occur within the element content models:

frontmatter bodymatter rearmatter level
<!ATTLIST level
Attribute Declared Value Default Value
Original:
%attrs;
depth CDATA #IMPLIED
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
depth CDATA #IMPLIED


"class" identifies the actual name (e.g., part, chapter, section, subsection) of the structure it marks.

"depth" indicates the nesting depth, starting at 1.
>
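
A sketch of recursive <level> nesting follows, with "class" naming each structure and "depth" recording the nesting depth starting at 1. The headings, text, and class values are illustrative assumptions.

```xml
<level class="part" depth="1">
  <levelhd>Part I</levelhd>
  <level class="chapter" depth="2">
    <levelhd>Chapter 1</levelhd>
    <p>Opening paragraph of the chapter.</p>
    <level class="section" depth="3">
      <levelhd>Section 1.1</levelhd>
      <p>Text of the section.</p>
    </level>
  </level>
</level>
```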

Go to the Element Structural Groupings.


dtbook Hierarchic Structure level1 ... level6


level1

level1 is the highest-level container of major divisions of a book. Used in <frontmatter>, <bodymatter>, and <rearmatter> to mark the largest divisions of the book (usually parts or chapters), inside which level2 subdivisions (often sections) may nest. The class attribute identifies the actual name (e.g., part, chapter) of the structure it marks. Contrast with <level>.
<!ELEMENT level1
Contains:
(h1 | level2 | %block; | %inlineinblock; | doctitle | docauthor)+
Expanded:
(h1 | level2 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum | doctitle | docauthor)+
>

May occur within the element content models:

frontmatter bodymatter rearmatter
<!ATTLIST level1
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

level2

level2 contains subdivisions that nest within <level1> divisions. The class attribute identifies the actual name (e.g., subpart, chapter, subsection) of the structure it marks.
<!ELEMENT level2
Contains:
(h2 | level3 | %block; | %inlineinblock;)+
Expanded:
(h2 | level3 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum)+
>

May occur within the element content model:

level1
<!ATTLIST level2
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

level3

level3 contains sub-subdivisions that nest within <level2> subdivisions (e.g., sub-subsections within subsections). The class attribute identifies the actual name (e.g., section, subpart, subsubsection) of the subordinate structure it marks.
<!ELEMENT level3
Contains:
(h3 | level4 | %block; | %inlineinblock;)+
Expanded:
(h3 | level4 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum)+
>

May occur within the element content model:

level2
<!ATTLIST level3
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

level4

level4 contains further subdivisions that nest within <level3> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks.
<!ELEMENT level4
Contains:
(h4 | level5 | %block; | %inlineinblock;)+
Expanded:
(h4 | level5 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum)+
>

May occur within the element content model:

level3
<!ATTLIST level4
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

level5

level5 contains further subdivisions that nest within <level4> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks.
<!ELEMENT level5
Contains:
(h5 | level6 | %block; | %inlineinblock;)+
Expanded:
(h5 | level6 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum)+
>

May occur within the element content model:

level4
<!ATTLIST level5
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>

level6

level6 contains further subdivisions that nest within <level5> subdivisions. The class attribute identifies the actual name of the subordinate structure it marks.
<!ELEMENT level6
Contains:
(h6 | %block; | %inlineinblock;)+
Expanded:
(h6 | p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum)+
>

May occur within the element content model:

level5
<!ATTLIST level6
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>
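
For comparison with the recursive <level>, the explicit hierarchy might be marked up as in the sketch below. Each levelN takes the matching hN heading and may contain only the next level down. The headings, text, and class values are placeholders.

```xml
<level1 class="chapter">
  <h1>Chapter 1</h1>
  <p>Introductory text.</p>
  <level2 class="section">
    <h2>Section 1.1</h2>
    <p>Text of the section.</p>
    <level3 class="subsection">
      <h3>Subsection 1.1.1</h3>
      <p>Text of the subsection.</p>
    </level3>
  </level2>
</level1>
```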

Go to the Element Structural Groupings.


Br, Linenum, Address, and Div Content Models


br

br marks a forced line break. [XHTML11]
<!ELEMENT br
Contains:
EMPTY
>

May occur within the element content models:

address author notice prodnote sidebar line a em strong dfn kbd code samp cite abbr acronym sub sup span bdo sent w q p doctitle docauthor levelhd h1 h2 h3 h4 h5 h6 hd dt dd li lic caption th td
<!ATTLIST br
Attribute Declared Value Default Value
Original:
%coreattrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED


Only the %coreattrs; appear, as there is no content to which the more general %attrs; would apply.
>

linenum

linenum contains a line number, for example in legal text.
<!ELEMENT linenum
Contains:
(#PCDATA)
>

May occur within the element content models:

address author notice prodnote sidebar line a em strong dfn kbd code samp cite abbr acronym sub sup span bdo sent w q p doctitle docauthor levelhd h1 h2 h3 h4 h5 h6 hd dt dd li lic caption th td
<!ATTLIST linenum
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>
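
A brief sketch of <linenum> and <br> used together inside a paragraph follows. Both elements are listed above as permitted within <p>. The text and line numbers are invented for illustration.

```xml
<p>
  <linenum>1</linenum> The party of the first part agrees<br/>
  <linenum>2</linenum> to the terms set out below.
</p>
```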

address

address contains a location at which a person or agency may be contacted. When <line> is used to contain the content of the individual lines, the class attribute can identify the content of each <line>. For example, class values might include: name, address, region (state, province, etc.), country, location code (such as ZIP code, provincial code), phone, fax, email, etc. [XHTML11]
<!ELEMENT address
Contains:
(%inline; | line)*
Expanded:
(#PCDATA | em | strong | dfn | code | samp | kbd | cite | abbr | acronym | a | img | imggroup | br | q | sub | sup | span | bdo | linenum | sent | w | pagenum | prodnote | annoref | noteref | line)*
>

May occur within the element content models:

frontmatter bodymatter rearmatter level level1 level2 level3 level4 level5 level6 div prodnote sidebar note annotation blockquote dd li th td
<!ATTLIST address
Attribute Declared Value Default Value
Original:
%attrs;
Expanded:
id ID #IMPLIED
class CDATA #IMPLIED
style CDATA #IMPLIED
title CDATA #IMPLIED
lang NMTOKEN #IMPLIED
xml:lang NMTOKEN #IMPLIED
dir (ltr|rtl) #IMPLIED
smilref CDATA #IMPLIED
showin (xxx|xxp|xlx|xlp|bxx|bxp|blx|blp) #IMPLIED
>
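
The use of <line> with class values inside <address> might look like the sketch below. The class names follow the examples given in the description above, and the address itself is fictitious.

```xml
<address>
  <line class="name">Example Library for the Blind</line>
  <line class="address">123 Example Street</line>
  <line class="region">Example State</line>
  <line class="country">Exampleland</line>
  <line class="phone">+1 555 0100</line>
</address>
```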

div

div is a generic container for subdivisions of a book. The <level1> ... <level6> hierarchy, or the <level> tag used recursively, should mark the major hierarchical structures of a book, while <div> is used in less formal circumstances or when for production purposes it is desired that a structure should be treated differently. Compare with <span>, which is used in inline settings. [XHTML11]
<!ELEMENT div
Contains:
(%block; | %inlineinblock; | doctitle | docauthor)+
Expanded:
(p | list | dl | div | blockquote | hr | img | imggroup | table | address | line | author | notice | prodnote | sidebar | note | annotation | a | cite | caption | samp | kbd | pagenum | doctitle | docauthor)+
>

May occur within the element content models:

frontmatter bodymatter rearmatter level level1 lev