The Text Outline Project: Glossary
Basic terms:
- The text: an entire work, or any part of it; e.g., Hobbes' Leviathan.
- Markup: XML or other markup that identifies (in a machine-readable
form) parts, chapters, paragraphs, sentences, etc., within a text. Later,
perhaps other kinds of markup.
- Chunks: approximately paragraph-sized bits of the text,
distinguished by their function (e.g., definition, argument,
explanation, description, etc.), together with other information about
these bits of text (see below).
- The outline: usually we will use this to mean just the outline itself,
not the chunks that are placed in it. The outline in this restricted
sense consists exclusively of its nodes and their headings.
- Nodes: parts or lines of the outline, or the items that appear beside a
number or bullet point; for example, "The ends of inquiry" and "The
lawfulness of private organizations" are headings of two nodes. Note,
however, that nodes are to be distinguished by numerical or other unique
identifiers, not by their headings, since the same node can have
different headings in different languages.
- Headings: the words, in some specific
language, assigned to a node. As the same node can exist in
multiple languages, it can have multiple headings.
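The distinction between a node and its headings can be sketched as a small data model. This is only an illustration in Python, not the project's actual format: the class name Node, its field names, the IDs "1" and "2", and the use of language codes such as "en" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One part of the outline, identified by its ID rather than by a heading."""

    node_id: str  # hypothetical unique identifier; the numbering scheme is assumed
    headings: Dict[str, str] = field(default_factory=dict)  # language code -> heading

    def heading(self, language: str) -> str:
        """Return the heading assigned to this node in the given language."""
        return self.headings[language]


# Two nodes, using the headings quoted above; the IDs are made up.
inquiry = Node("1", {"en": "The ends of inquiry"})
lawfulness = Node("2", {"en": "The lawfulness of private organizations"})
outline = [inquiry, lawfulness]

# Adding a heading in another language leaves the node's identity unchanged.
inquiry.headings["fr"] = "(heading in French)"  # placeholder, not a real translation
print(inquiry.node_id, sorted(inquiry.headings))
```

Keying the headings by language, and looking nodes up by ID, means a heading can be reworded or translated without affecting anything that refers to the node.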
Chunks are made up of:
- Function: a single-word description of the linguistic function of the
chunk; e.g., definition, argument, explanation, or description.
- Summary: a summary of the content of the chunk.
- The sentences: the words actually taken from the text; often, a
paragraph. (A more technically useful concept might be the
language-independent identifiers of sentences, stated in terms of the
markup of the text.)
- Reference: a human-friendly, automatically generated pointer to where the
sentences may be found in the text; e.g., "Hobbes, Lev XIV 1".
- Chunk metadata: the above components of a chunk. Project participants may
decide to add other required metadata fields.
Example of a chunk:
A very small and relatively powerless state is ineffective to remove the
state of nature. [summary]
Nor is it the joining together of a small number of men that gives them
this security; because in small numbers, small additions on the one side or the
other make the advantage of strength so great as is sufficient to carry the
victory, and therefore gives encouragement to an invasion. The multitude
sufficient to confide in for our security is not determined by any certain
number, but by comparison with the enemy we fear; and is then sufficient when
the odds of the enemy is not of so visible and conspicuous moment to determine
the event of war, as to move him to attempt. [the sentences]
Hobbes, Lev XVII 3 [reference]
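As a rough sketch, the example chunk could be held in a simple record with one field per component. The class name Chunk and the field names are hypothetical, and the function value "argument" is an assumption, since the example above does not label its function.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Chunk:
    """A paragraph-sized bit of text plus its metadata (field names are hypothetical)."""

    function: str   # e.g. "definition", "argument", "explanation", "description"
    summary: str    # a summary of the content of the chunk
    sentences: str  # the words actually taken from the text
    reference: str  # human-friendly pointer into the text


example = Chunk(
    function="argument",  # assumed; the example above does not state a function
    summary=(
        "A very small and relatively powerless state is ineffective to "
        "remove the state of nature."
    ),
    sentences=(
        "Nor is it the joining together of a small number of men that gives "
        "them this security; because in small numbers, small additions on the "
        "one side or the other make the advantage of strength so great as is "
        "sufficient to carry the victory, and therefore gives encouragement to "
        "an invasion. The multitude sufficient to confide in for our security "
        "is not determined by any certain number, but by comparison with the "
        "enemy we fear; and is then sufficient when the odds of the enemy is "
        "not of so visible and conspicuous moment to determine the event of "
        "war, as to move him to attempt."
    ),
    reference="Hobbes, Lev XVII 3",
)

# The chunk metadata as a plain dictionary, one key per component.
metadata = asdict(example)
print(sorted(metadata))
```

If participants add further required metadata fields, they would become additional fields on such a record.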
Finally, we will use a verb, "to chunk," to mean dividing a text up into chunks.
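A very rough first pass at chunking could simply split a text at blank lines, on the assumption that paragraphs are separated that way. This is only a sketch: the function name chunk_paragraphs is hypothetical, and deciding each chunk's function, summary, and final boundaries would still require a reader's judgment.

```python
import re
from typing import List


def chunk_paragraphs(text: str) -> List[str]:
    """Split text into paragraph-sized pieces at blank lines (a naive first pass)."""
    pieces = re.split(r"\n\s*\n", text.strip())
    # Collapse the line breaks inside each paragraph and drop empty pieces.
    return [" ".join(piece.split()) for piece in pieces if piece.strip()]


sample = "First paragraph,\nsecond line.\n\nSecond paragraph.\n"
for piece in chunk_paragraphs(sample):
    print(piece)
```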