This technique applies to plain text documents. It does not apply to technologies that contain markup.
The objective of this technique is to recognize a paragraph in a plain text document. A paragraph is a coherent block of text, such as a group of related sentences that develop a single topic or a coherent part of a larger topic.
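The beginning of a paragraph is indicated by the beginning of the content (that is, the paragraph is the first content in the document) or by exactly one blank line preceding the paragraph text.
The end of a paragraph is indicated by the end of the content (that is, the paragraph is the last content in the document) or by one or more blank lines following the paragraph text.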
A blank line contains zero or more non-printing characters, such as space or tab, followed by a new line.
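The blank-line definition above can be turned into a small paragraph splitter. The following is a minimal Python sketch, not part of the technique itself: the function names `is_blank` and `split_paragraphs` are hypothetical, and "non-printing characters" is assumed to mean whatever Python's `str.strip()` treats as white space.

```python
from typing import List


def is_blank(line: str) -> bool:
    """True if the line holds only white space (assumption: str.strip() semantics)."""
    return line.strip() == ""


def split_paragraphs(text: str) -> List[str]:
    """Group consecutive non-blank lines into paragraphs.

    Leading and trailing white space on each line is dropped, so
    indentation does not affect where paragraphs begin or end.
    """
    paragraphs: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if is_blank(line):
            # A blank line closes the paragraph in progress, if any.
            if current:
                paragraphs.append(" ".join(current))
                current = []
        else:
            current.append(line.strip())
    # The end of the content also closes the last paragraph.
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Joining the lines with a single space is a presentation choice made for this sketch only; a tool that needs the original line breaks could keep the lines as a list instead.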
Two paragraphs. Each starts and ends with a blank line.

This is the first sentence in this
paragraph. Paragraphs may be long
or short.

    In this paragraph the first line is
indented. Indented and non-indented
sentences are allowed. White space within
the paragraph lines is ignored in
defining paragraphs. Only completely blank
lines are significant.

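For each paragraph, check that the paragraph is preceded by exactly one blank line or is the first content in the document, that it is followed by at least one blank line or is the last content in the document, and that it contains no blank lines.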
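A per-paragraph check of this kind can be partly automated. The Python sketch below is a hypothetical illustration rather than an official test tool. It assumes the rule that each paragraph is preceded by exactly one blank line unless it starts on the first line of the document, and it assumes that a blank line is one containing only white space. The function name `find_spacing_problems` is invented for this example. Because a blank line always ends a paragraph in this model, a paragraph can never contain one, so only the spacing before each paragraph needs checking.

```python
from typing import List, Tuple


def find_spacing_problems(text: str) -> List[Tuple[int, int]]:
    """Return (line_number, blank_count) for each paragraph whose preceding
    run of blank lines is not exactly one.

    A paragraph that starts on the first line of the text is accepted
    as the first content in the document.
    """
    problems: List[Tuple[int, int]] = []
    blank_run = 0          # blank lines seen since the last text line
    in_paragraph = False
    for number, line in enumerate(text.splitlines(), start=1):
        if line.strip() == "":
            blank_run += 1
            in_paragraph = False
            continue
        if not in_paragraph:
            # This line opens a new paragraph: inspect the gap before it.
            first_content = number == 1
            if not first_content and blank_run != 1:
                problems.append((number, blank_run))
            in_paragraph = True
        blank_run = 0
    return problems
```

For instance, `find_spacing_problems("A\n\nB\n\n\nC")` reports the paragraph starting on line 6, because two blank lines precede it. A human reviewer is still needed to judge whether each block of text is a coherent paragraph.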