How to collate a text

This is an explanation of how to get started, for participants in the Collation Project pilot.

Preliminaries

Minimum qualifications

The Collation Project pilot is limited to people who either have a bachelor's degree in philosophy or have done an equivalent amount of study. You don't have to have a Ph.D. You just have to have the ability to do work at approximately the beginning graduate student level in philosophy.

Please work on collating a text if, and only if, you have already read it at least fairly carefully. Ideally, you would have taught it or studied it in depth in a class.

Note: these are qualifications for participation in the pilot project. Qualifications may be different when the project begins in earnest.

Your role in the pilot project

Your role in the pilot project is to help build an impressive outline that will serve as an example to the scholarly community and the world at large of the viability and value of Textop and of the Collation Project in particular. Moreover, your opinions and experience gained through your work on the project will be invaluable when it comes to settling upon official policies. As persons with experience actually doing collation, your opinions will matter. (Right now, there are very few official policies. We're making it up as we go along.)

This is a collaborative community

This is, for reasons explained in the project manifesto, a collaborative community. So, once things are kicked off, we'll be working together and thinking together about how best to pursue this sort of project. We've already started doing that on the Textop mailing lists, but here on this wiki we will be doing a more robust sort of collaboration: we will actually be editing each other's work and constructing a single document, the outline (and its contents), together.

Since this is a collaborative community, you as a participant must feel free to get involved in bold, new, and creative ways, if this pilot project is going to work. You must feel licensed not only to get to work, but to speak your mind about how work should proceed. Be polite, of course, but be bold!

Getting started

Register your involvement

To get involved in the Collation Project pilot, please do four things:

  1. Review the minimum qualifications above.
  2. Join textop-en-phil.
  3. Using your own real name (not a pseudonym), create a user page that links to information about yourself, like this one: Larry Sanger
  4. Add a link to your user page from one of the collated text home pages, such as the one here: Hobbes, Leviathan

If you do these four things, you should be all set. If in addition you want a personal welcome then by all means send the project director, Larry Sanger, an e-mail at larry.sanger [at] dufoundation.org. Otherwise, just join in: we will trust that you meet the minimum qualifications and are otherwise a spiffy individual.

Learn about the wiki software

Consult the User's Guide for information on using the wiki software. See also the MediaWiki FAQ. Here on this wiki feel free to use the sandbox to make experiments to pages.

Choose a text to work on

Some possible texts are listed here and here. Since we will want to chunk works in their original languages, and since we are set up organizationally to work in English at the present time (we have grand plans for internationalization), please pick a public domain work of philosophy originally written in English.

To reiterate a few points:

  • Please work on collating a text if, and only if, you have already read it at least fairly carefully. Ideally, you would have taught it or studied it in depth in a class. You would have at least read it and understood it reasonably well.
  • Once you have decided on a text (or texts), please put a link to your user page on the text's home page.

Please don't chunk a text if you have not put your name on the text's home page. Discussion about our use of the text will take place either on that page or on the attached discussion page.

Upload the text

We will need to upload the texts to the wiki, because it is the only really straightforward way to mark where we are in our collation effort for any given text, and to make it as easy as possible to copy wiki-ready parts of the text.

If no one has uploaded the text yet, you can do that yourself. (Alternatively, you could request that someone else do it.) To upload the text, try this:

  1. Locate a digital copy of the text. (At this point, we don't really care about the provenance of the copy, as long as it appears to be from a reasonably credible source. Project Gutenberg will do for present purposes.)
  2. Separate the text into manageable parts--five or ten.
  3. Create a separate wiki page for each part of the text. Paste the text onto the page. Make sure that all paragraphs are numbered; restart numbering at each newly-labelled section. (You might have to use some tool, such as a word processor, to massage the text so that it looks all right in the wiki; but texts from Project Gutenberg should work fine with minimal editing. Note that the numbered list feature of word processing programs can automatically insert paragraph numbers.)
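
If you would rather not number paragraphs by hand or with a word processor, a short script can do it. The following is only a minimal sketch in Python, not a project tool: the function names and the input shape (a mapping from section title to raw section text, with paragraphs separated by blank lines) are assumptions made for illustration.

```python
import re


def split_paragraphs(text):
    """Split raw text into paragraphs, treating blank lines as separators."""
    parts = re.split(r"\n\s*\n", text.strip())
    return [part.strip() for part in parts if part.strip()]


def number_paragraphs(sections):
    """Number the paragraphs of each section, restarting at 1 per section.

    `sections` maps a section title to that section's raw text
    (a hypothetical input shape; the guide does not prescribe one).
    """
    lines = []
    for title, text in sections.items():
        lines.append(title)
        for number, paragraph in enumerate(split_paragraphs(text), start=1):
            lines.append(f"{number}. {paragraph}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip()
```

Calling number_paragraphs on a mapping with two sections yields each title followed by its paragraphs numbered from 1, which can then be pasted into the wiki page and tidied by hand.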

There is a video explaining how to do this on tutorial videos.

Chunking and marking up the text

To chunk a text is to divide it into approximately paragraph-sized "chunks." If you just don't have any clue what that means, please first consult the Collation Project summary, the glossary, and the pages linked from The Outline before reading the following. Marking up a text involves (at least) saying what its "linguistic function" is (definition, argument, explanation, example, etc.); summarizing it in a sentence; and giving the source.

Important note: all the text from every collated work should appear at least once somewhere in the outline. No sentences should be omitted. This will present a few small problems, but nothing too difficult.
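
The coverage rule can be checked mechanically once a work has been split into sentences. Here is a rough sketch in Python, assuming the sentences of the work and of each chunk are already available as lists of strings; how the text is split into sentences is left open, and the function name is hypothetical.

```python
def missing_sentences(work_sentences, chunks):
    """Return the sentences of the work that appear in no chunk.

    `work_sentences` is the full text as an ordered list of sentences.
    `chunks` is a list of chunks, each a list of whole sentences.
    Chunks may overlap; a sentence only has to appear at least once.
    """
    covered = {sentence for chunk in chunks for sentence in chunk}
    return [s for s in work_sentences if s not in covered]
```

An empty result means every sentence has been placed somewhere; anything returned still needs a chunk of its own or a home in an existing chunk.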

What constitutes a chunk?

A chunk is often approximately what is usually made into a single paragraph. A chunk corresponds to an approximately "paragraph-sized" point. For instance, in the first paragraph of Hobbes' Leviathan, found here, while he is making various points, Hobbes appears to be making one main point, namely, to argue for and illustrate the proposition that the commonwealth or State is like a living being. Just bear in mind that some authors are very bad at paragraphing, and tend to make several points per paragraph; others as it were artificially separate a single point into several paragraphs.

Incidentally, but importantly, the components of a chunk are to be whole sentences, not parts of sentences. (This will require fixing the initial work on Hobbes.) Easy internationalization is the reason for this: only at this level of granularity can we be sure that there will be a one-to-one matching between an original text and a translation of the text.

A chunk must not be made too long. Ultimately this has to come down to simple word count. More than 400-500 words is too long. But a single short sentence could be a chunk, given that the surrounding sentences do not directly contribute to the point that is made by the sentence.
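
Since the limit comes down to word count, it is easy to test. The sketch below is in Python and takes the upper figure of 500 words as the cut-off; that choice, like the function names, is an assumption for illustration rather than an official policy.

```python
def word_count(text):
    """Count whitespace-separated words in a chunk."""
    return len(text.split())


def is_too_long(chunk_text, max_words=500):
    """Report whether a chunk exceeds the (assumed) 500-word limit."""
    return word_count(chunk_text) > max_words
```

A chunk flagged this way is a candidate for splitting into two chunks at a sentence boundary.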

Occasionally, we will want to prise out parts of paragraphs because they are of independent interest all on their own. For example, in the first paragraph of Leviathan, there is a definition of "nature" that is of some interest in metaphysics--which is not the subject of the rest of the paragraph. So that first sentence can be made into its own chunk (currently filed under Metaphysics > Nature), which overlaps with another chunk (the whole paragraph, which is currently filed under Political Philosophy > The State).

Sometimes a second or third paragraph really is a continuation of a point made in a first paragraph. For example, in Leviathan Ch. XXVII, pars. 40-52, Hobbes gives a list, one item per very short paragraph, illustrating the claim that the more harmful the consequences of a crime are, the more severe the offense. Currently those one-sentence paragraphs are all grouped together in a single chunk. (Future development of the outline might result in them being broken apart.)

Marking up a chunk

Once you have settled on a chunk, you'll have to add two pieces of "metadata," the linguistic function and a summary, and then give the source. Actually, in the process of settling on a chunk, you might notice that you already have a linguistic function and summary in mind: the main idea of a chunk is, after all, what you're going to put in the summary, and function is what the chunk does or accomplishes.

Note that the precise same sentences, or overlapping sets of sentences, can make two broadly related but distinct points, and so can be placed in different parts of the outline. If you are inclined to place a chunk in more than three places, though, you probably have not quite grasped what we're up to; please get help from someone (you can always try Larry Sanger). Bear in mind that the summary might very well be different for a chunk placed in a different part of the outline.

Do mark up the text on the text pages (e.g., don't mark up the wiki texts you find linked from Hobbes, Leviathan). This will allow others to see and review your work. It is probably too much work to keep track of under what headings you file chunks (particularly because that is highly changeable). Use the texts linked from Hobbes, Leviathan as a template. Generally speaking, for each section of a work, you'll want to keep a full unedited copy of the text (indented) for future reference, and then the chunks you've made out of that section.

Example

Here is an example of a marked-up chunk (with the three types of metadata marked in parentheses):

Argument (linguistic function)
One cannot have a will to do something and then not do it. (summary)
And though we say in common discourse, a man had a will once to do a thing, that nevertheless he forbore to do; yet that is properly but an inclination, which makes no action voluntary; because the action depends not of it, but of the last inclination, or appetite. For if the intervenient appetites make any action voluntary, then by the same reason all intervenient aversions should make the same action involuntary; and so one and the same action should be both voluntary and involuntary.
Hobbes, Lev VI 53 (source)
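
For those who think in terms of data, a marked-up chunk can be pictured as a small record with four fields. The following Python sketch is illustrative only: the class, its field names, and the format_source helper are assumptions, and the layout it prints simply mirrors the example above.

```python
from dataclasses import dataclass


def format_source(author, title, place):
    """Build a citation such as 'Hobbes, Lev VI 53' (author, title, place)."""
    return f"{author}, {title} {place}"


@dataclass
class Chunk:
    function: str  # linguistic function, e.g. "argument"
    summary: str   # one-sentence summary of the chunk
    text: str      # the whole sentences that make up the chunk
    source: str    # abbreviated citation

    def render(self):
        """Lay the chunk out as function, summary, text, source."""
        return "\n".join(
            [self.function.capitalize(), self.summary, self.text, self.source]
        )
```

Built from the Hobbes passage above, such a record would render with "Argument" on the first line and "Hobbes, Lev VI 53" on the last.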


Linguistic functions

Each chunk should be assigned one linguistic function.

(Note that not all the chunks in the original outline of Leviathan adhered to this--there are several that are dual labelled, such as "Definition and argument" or "Proposition and example." In cases where one might be inclined to give two labels in this way, either there should be two separate chunks, or one of the labels should be dropped as unnecessary.)

Here we'll keep a "canonical" list of linguistic functions, together with descriptions and examples:

  • definition: text that states and elaborates the meaning of a word or concept. E.g., "A punishment is an evil inflicted by public authority on him that hath done or omitted that which is judged by the same authority to be a transgression of the law, to the end that the will of men may thereby the better be disposed to obedience." (Hobbes, Lev XXVIII 1)
  • distinction: an explanation of the difference in meaning or content between two words or concepts. E.g., "Therefore between counsel and command, one great difference is that command is directed to a man's own benefit, and counsel to the benefit of another man. And from this ariseth another difference... [Hobbes goes on to list two more.]" (Hobbes, Lev XXV 3)
  • typology: a brief summary of the differences in meaning or content between three or more words or concepts; note, a long series of definitions can be prised apart as separate definitions, but if the author is concerned to draw contrasts, or especially to lay out logical or semantic relations in a very short space, the whole should be bundled together and labelled a typology. E.g., "Pride subjecteth a man to anger, the excess whereof is the madness called rage, and fury. And thus it comes to pass that excessive desire of revenge, when it becomes habitual, hurteth the organs, and becomes rage: that excessive love, with jealousy, becomes also rage: excessive opinion of a man's own self, for divine inspiration, for wisdom, learning, form, and the like, becomes distraction and giddiness: the same, joined with envy, rage: vehement opinion of the truth of anything, contradicted by others, rage." (Hobbes, Lev VIII 19)
  • proposition: bald (unargued-for) claims can be labelled "propositions" even if they are briefly elaborated and there are extremely weak or trivial arguments attached. E.g., "The vainglory which consisteth in the feigning or supposing of abilities in ourselves, which we know are not, is most incident to young men, and nourished by the histories or fictions of gallant persons; and is corrected oftentimes by age and employment." (Hobbes, Lev VI 41)
  • argument: a proposition supported by reasons significantly different from the proposition; to be counted as an argument, it should not be an argument merely in form (not everything that uses "therefore" or "because" is an argument) or obviously trivial. E.g., "When a man, upon the hearing of any speech, hath those thoughts which the words of that speech, and their connexion, were ordained and constituted to signify, then he is said to understand it: understanding being nothing else but conception caused by speech. … And therefore of absurd and false affirmations, in case they be universal, there can be no understanding; though many think they understand then, when they do but repeat the words softly, or con them in their mind." (Hobbes, Lev IV 22)
  • explanation: generally, a causal explanation of a type of phenomenon; causal principles or laws would be included here. E.g., "Eloquence, with flattery, disposeth men to confide in them that have it; because the former is seeming wisdom, the latter seeming kindness. Add to them military reputation and it disposeth men to adhere and subject themselves to those men that have them. The two former, having given them caution against danger from him, the latter gives them caution against danger from others." (Hobbes, Lev XI 16)
  • description: a simple description of a phenomenon without an attempt to explain it. E.g., "Sometimes a man knows a place determinate, within the compass whereof he is to seek; and then his thoughts run over all the parts thereof in the same manner as one would sweep a room to find a jewel; or as a spaniel ranges the field till he find a scent; or as a man should run over the alphabet to start a rhyme." (Hobbes, Lev III 6)
  • example: an illustration or a series of them of a point; includes extended cases; bear in mind that if an example is used to argue for a general proposition, rather than just to illustrate something that seems to be taken for granted, then it should be called an argument. E.g., "For example sake, it is against the law of nature to punish the innocent; and innocent is he that acquitteth himself judicially and is acknowledged for innocent by the judge. Put the case now that a man is accused of a capital crime, and seeing the power and malice of some enemy, and the frequent corruption and partiality of judges, runneth away..." (Hobbes, Lev XXVI 24)

We will add to this list as needed; and with further discussion we may consolidate items or relabel them.
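
The one-function-per-chunk rule and the canonical list lend themselves to a simple check. This is a minimal sketch in Python, assuming the eight labels listed above are the whole list for now; the names CANONICAL_FUNCTIONS and validate_function are made up for illustration.

```python
# The canonical labels as listed above; the list is expected to change.
CANONICAL_FUNCTIONS = (
    "definition",
    "distinction",
    "typology",
    "proposition",
    "argument",
    "explanation",
    "description",
    "example",
)


def validate_function(label):
    """Return the normalized label, or raise if it is not a single canonical one.

    Dual labels such as "Definition and argument" are rejected, because
    each chunk is to be assigned exactly one linguistic function.
    """
    normalized = label.strip().lower()
    if normalized not in CANONICAL_FUNCTIONS:
        raise ValueError(f"not a canonical linguistic function: {label!r}")
    return normalized
```

A rejected label is a prompt either to split the chunk in two or to drop the less important of the two labels.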

Chunk summary

A succinct, one-sentence summary of the chunk's contents is all that is required. Clearly, a great deal of nuance will be left out of such summaries. In general, it is better to use the source's jargon, unless some clearer but roughly equivalent way can be found to express the point.

Source

After each quotation, the source should be given in this general format: author, title, place in text. Sources should be cited in abbreviated format according to whatever pattern scholars have developed for citing that source in their scholarly work.

Further rules

More detailed rules on chunking can be found at chunking rules.

Editing the outline and placing chunks

Theoretical background

How to get started with this sort of outline

The current top-level nodes were more or less arbitrarily chosen based on some common "subdisciplines" of philosophy. (But this outline will eventually "morph" into one that will serve many different disciplines. Consequently, the top-level nodes may have to be renamed, and in any event they will certainly have to be rethought.)

Then, underneath the top-level nodes, new subnodes were created to accommodate a new text. For example, Hobbes begins Leviathan with a definition of "nature." The topic of nature (in the sense Hobbes defined it) is generally metaphysical; but, without elaborating an outline a priori, little more than that can or should be said to locate the chunk. So we created a "Nature" subhead under the "Metaphysics" top-level head, and placed Hobbes' definition of "nature" under the "Nature" subhead. But this is little better than a placeholder. It's probable that, when we get more texts to place under the Metaphysics heading, we will have elaborated a structure that indicates that definitions of "nature" properly belong some number of levels down, so to speak.

More generally, we regard outline construction as a bottom-up and highly iterative process. We make well-informed but essentially arbitrary choices about the higher-level nodes like Metaphysics and Ethics. As we collate more and more detailed discussions about specific topics, we fill out the deep structure of (part of) the outline; this then will allow us to revisit the higher-level structure to see whether it properly accommodates the details.

Outline construction is text-driven

The central guiding rule of outline creation is: headings are created only to provide an appropriate place for a chunk. The outline must not be expanded unduly without chunks to file under the new nodes.

The reason for this rule is that the Collation Project is not in the business of creating ontologies a priori, but rather of mapping "the lay of the dialectical landscape." We dare not say its requirements are "objective," but it is a clear enough activity, driven first and foremost by the texts we choose and by rules we will adopt for handling texts, that many people will be able to collaborate and arrive at some sensible compromise positions.

One exception to the rule comes to mind: suppose a new node is needed for a new kind of text, and the node will certainly live not directly under a general node that now exists, but under a (new) subnode that does not yet exist. In other words, the new text needs to be filed under a "grandchild" of an existing node. Then the child may be created in order to place the grandchild node.
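
The text-driven rule and its one exception can be sketched as a filing routine that creates headings only on the way to placing a chunk. The Python below is an illustration under stated assumptions, not project software: the outline is modelled as nested dictionaries, and file_chunk and max_new_nodes are hypothetical names. The default limit of two new nodes stands for "a child created in order to place a grandchild."

```python
def file_chunk(outline, path, chunk, max_new_nodes=2):
    """File `chunk` at the node named by `path`, creating nodes only as needed.

    `outline` maps a heading to {"chunks": [...], "children": {...}}.
    `path` lists headings from a top-level node down to the target node.
    At most `max_new_nodes` headings may be created along the way.
    """
    if not path:
        raise ValueError("a chunk must be filed under some heading")

    # First pass: count the headings on the path that do not exist yet.
    level, missing = outline, 0
    for heading in path:
        if missing or heading not in level:
            missing += 1
        else:
            level = level[heading]["children"]
    if missing > max_new_nodes:
        raise ValueError("too many new headings for a single chunk")

    # Second pass: create what is missing and file the chunk.
    level, node = outline, None
    for heading in path:
        node = level.setdefault(heading, {"chunks": [], "children": {}})
        level = node["children"]
    node["chunks"].append(chunk)
    return node
```

Because the count is checked before anything is created, a rejected request leaves the outline untouched.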

Part of the outline is ordered according to semantic reducibility

Generally speaking, if it is possible to speak about Topic A without speaking about Topic B, then we will say that A is semantically prior to B. Thus, for example, it is possible to talk about being in general without talking about any particular type of being; it is possible to talk about life in general without talking about human life; it is possible to talk about the mind in general without talking about specific mental functions; and so forth.

The difficulty, however, is that of course we cannot agree on this sort of semantic or ontological reducibility. If we could, then a lot of philosophical problems would be solved. Consequently, we will have to make some controversial and perhaps arbitrary-sounding decisions about where certain topics are to be placed.

For instance, for many religious thinkers, God is an irreducible, fundamental type of being, and thus naturally belongs "near the top" of the outline, for instance as a subhead directly under "Being" (or "Metaphysics"). For others, however, who believe that nothing deserving the name "God" even exists, the topic of God is best regarded as an outgrowth of certain psychological or sociological conditions.

We will have the largest buy-in from potential contributors, and what will appear to many observers to be a maximally coherent conceptual structure, if we place items where their authors would intend them to be placed in the outline. Hence, Aquinas' discussions of God would belong under some high-level metaphysical headings, while Freud's would belong further down, under some psychological headings.

Note that the top-level nodes are themselves arranged in order of semantic reducibility. (You can talk about Metaphysics without talking about Ethics, but you can't talk about Ethics without talking about the sort of things that are mentioned in Metaphysics.) So nodes of the outline are generally to be arranged in order of semantic reducibility both top to bottom and left to right. You might imagine all the top-level nodes after the first ones being transcluded into the first.

Another part of the outline will be ordered by time and place

When we begin to try to accommodate history texts, we will face a new kind of outlining problem. Clearly, it will be necessary to order certain chunks by time and place. We will revisit the issues involved with this, and expand the current section, when we begin working more with history texts.

Some practical guidelines

The Outline is not hard-wired. It never will be. It is meant to be edited.

If you are qualified to work on the pilot project at all, then do feel free to edit the outline when you are chunking texts. The project simply will not advance if you do not feel emboldened to do this. By no means is it perfect now, and it won't be in good shape for quite some time to come.

General rules

The above discussion suggests a few general rules:

  • Create a new heading only to provide an appropriate place for a chunk. You might make a subnode to place a subsubnode, but don't get more ambitious than that.
  • Don't agonize too much about how to improve the structure of the higher-level nodes. Focus your work on filling out the structure of more specialized nodes (an example can be seen under the "Law" heading of The Outline--since that's what Hobbes wrote so much about).
  • In deciding where to place brand new sorts of nodes, ask yourself whether it is possible in general to speak about the topic without talking about another topic. You can't talk about ethics without talking about human beings, so ethical topics are going to be filed somewhere below "Human Nature."

Other rules for outlining may be found in outlining rules.

Placing a chunk under an existing node

Sometimes all that you need to do, to accommodate a chunk, is to paste it under an already-existing node. Bear in mind, however:

  • The chunk must really directly concern the subject described by the node's heading.
  • If there are more than four or five chunks under a node, once you add yours, look over the chunks and see if it would not perhaps make sense to divide the node into subnodes. Do not place more than, say, 10 chunks under a single node.
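
The two thresholds in the second point can be expressed as a tiny check. This Python sketch takes "four or five" to mean five and treats 10 as a firm ceiling; both readings, and the function name, are assumptions made only for illustration.

```python
def placement_advice(chunk_count, review_above=5, hard_limit=10):
    """Suggest what to do with a node holding `chunk_count` chunks.

    Returns "ok", "review" (consider dividing the node into subnodes),
    or "split" (the node holds more chunks than it should).
    """
    if chunk_count > hard_limit:
        return "split"
    if chunk_count > review_above:
        return "review"
    return "ok"
```

The count passed in should include the chunk you have just added.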

Creating a new node under an existing node

Sometimes what you need to do, to accommodate a text, is simply create a new node under an existing node. That's very straightforward. The main test for determining whether a new node belongs under an existing node is that child nodes should in some essential way be about the topic of the parent node. In time we will probably work out a list of canonical relations that can exist between parent nodes and child nodes, but currently we have no such list.

For instance, suppose I come across a text that gives a correspondence theory of truth, and that the node "Truth" (but nothing more detailed) now exists. Then it is simply a matter of creating a new node titled "Correspondence theory of truth."

Annotating nodes

At some point--when we are relatively sure that we will want to keep a certain heading--we will want to begin defining what we mean by a certain heading such as "exact science" or "legal personhood," in order to give future users help in deciding whether a chunk belongs under that heading or not. The place to put such annotations is on the node pages themselves, at the top of the page (before any chunks), as you will find on the Religion page.

Renaming a node

People will frequently produce a name for a node off the top of their heads, without racking their brains for exactly the right one. So you should feel free to reword a heading if you think of something more felicitous.

If you do rename a node, however, you must also:

  1. Move the chunks and any other content from the old page to the new page. Also, do a redirect from the old page to the new page. (The "move" function, if it is available to you, will do these two things for you automatically.)
  2. Check to make sure that you have not changed the meaning, even subtly. If you have, check whether the chunks filed under the node are all still appropriately placed. Please do address yourself to this problem. This might require placing the now-misfit chunks under another node, or under a brand new node. Bear in mind that this might be the best thing to do.

Repositioning a node

Feel free to reposition nodes (move them up or down), particularly in accordance with the principle of semantic reducibility described above.

Editing others' work

Textop in general is and will be a collaborative project. While editors will be responsible for making final decisions in case of dispute (for purposes of this pilot project, decisions can be imparted on discussion pages, viewable under the "discussion" tab at the top of the page), collaborators need not ask their permission to make most edits. In fact, many types of edits are very welcome.

Types of welcome edits

Chunking

  • Consolidating sequential chunks that really are (on reanalysis) making a single point.
  • Splitting one chunk into two, when separate points are made sequentially, or when a chunk is just too long. (If two main points are made by the same sentences, or by overlapping sentences, then there should be two chunks that use two copies of the overlapping sentences.)
  • Reinserting elided text where an ellipsis is marked. (Also make sure that whole sentences are added, not just parts of sentences.)
  • Making an ellipsis to remove text irrelevant to a point. (Just make sure that the elided text does appear somewhere in the outline: the rule is that all the text from every collated work appears at least once somewhere in the outline. Also make sure that whole sentences are removed.)
  • Adding new chunks to ensure that each sentence from the original appears in the outline at least once.

Mark up

  • Changing the linguistic function. Sometimes an argument is misidentified as a proposition or as an explanation. These cases might require debate, and the result will be a more refined idea on everyone's part of what these categories ought to be and how they are defined.
  • Expanding a too-brief summary. (Just not more than one fairly simple sentence, please. Summaries are not meant to be perfect--just to express the gist roughly. People can read the whole chunk if they want; chunks shouldn't be too long.)
  • Fine-tuning a summary so that it more closely matches the main point of a chunk.
  • Editing the source information so that it follows a format. (When Textop's software is written, this will be generated automatically.)

Filing chunks

  • You might explore through some existing outline nodes and check that all of the chunks that live at the node are really appropriately placed there.

Outline editing

See the outline editing guidelines above.

The outline is the main collaborative product of the community working on the Collation Project. We must, therefore, be "on the same page" with respect to the principles of outlining. But bear in mind that, at this early stage, we are making it up as we go along, in more ways than one. Please discuss major or interesting changes either on textop-en-phil or on Talk:The Outline.

Type of edits not so welcome
