
PREPARSER

A growing number of systems no longer attempt to parse a sentence directly from the string of words to a full parse tree. Certain small-scale structures are very common and can be recognized with high reliability. The Preparsing module recognizes these structures, thereby simplifying the task of the Sentence Parser. Some systems recognize noun groups (noun phrases up through the head noun) at this level, as well as verb groups (verbs together with their auxiliaries). Appositives can be attached to their head nouns with high reliability, as can genitives, "of" prepositional phrases, and perhaps some other prepositional phrases. "That" complements are often recognized here, and NP conjunction is sometimes done as a special process at this level.

Sometimes the information found at this level is merely encapsulated and sometimes it is discarded. Age appositives, for example, can be thrown out in many applications.
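The choice between encapsulating and discarding can be illustrated with a minimal sketch for age appositives. Everything in it is an assumption made for illustration: the pattern (a capitalized name followed by a comma-delimited number), the function name `strip_age_appositives`, and the `keep` flag are hypothetical and do not come from any particular system.

```python
import re

# Illustrative pattern only: a capitalized name followed by ", <number>,"
# as in "John Smith, 47, was named president."
AGE_APPOSITIVE = re.compile(
    r"(?P<head>[A-Z][a-z]+(?: [A-Z][a-z]+)*), (?P<age>\d{1,3}),"
)


def strip_age_appositives(sentence, keep=False):
    """Remove age appositives from a sentence.

    With keep=True the (head, age) pairs are returned alongside the
    cleaned sentence (encapsulated); with keep=False they are discarded.
    """
    found = []

    def replace(match):
        found.append((match.group("head"), int(match.group("age"))))
        # Leave only the head noun in the text passed to the parser.
        return match.group("head")

    cleaned = AGE_APPOSITIVE.sub(replace, sentence)
    return cleaned, (found if keep else [])
```

Called on "John Smith, 47, was named president." the function hands the parser the simpler string "John Smith was named president.", and with `keep=True` it also returns the pair `("John Smith", 47)` for applications that need the age.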

This module generally recognizes the small-scale structures or phrases by finite-state pattern matching, sometimes conceptualized as ad hoc heuristics. The patterns are acquired manually.
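As a rough illustration of finite-state pattern matching over small-scale structures, the sketch below groups part-of-speech-tagged tokens into noun groups and verb groups. It assumes the input is already tagged; the coarse tag set, the two hand-written patterns, and the function name `preparse` are hypothetical choices made for this example, not those of any actual system.

```python
import re

# Hypothetical coarse tag set, one letter per tag, so that a tag sequence
# becomes a string that a regular expression can scan.
TAG_CODES = {"DET": "D", "ADJ": "A", "NOUN": "N", "AUX": "X", "VERB": "V"}

# Hand-written patterns over tag codes (illustrative only).
# NG (noun group): optional determiner, any adjectives, one or more nouns.
# VG (verb group): any auxiliaries followed by a main verb.
GROUP_PATTERN = re.compile(r"(?P<NG>D?A*N+)|(?P<VG>X*V)")


def preparse(tagged):
    """Group (word, tag) pairs into NG and VG chunks; other tokens get label "O"."""
    # Tags outside the table are mapped to "O" and never join a group.
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in tagged)
    chunks = []
    pos = 0
    for match in GROUP_PATTERN.finditer(codes):
        # Tokens between matches pass through as single-word chunks.
        for word, _ in tagged[pos:match.start()]:
            chunks.append(("O", [word]))
        words = [word for word, _ in tagged[match.start():match.end()]]
        chunks.append((match.lastgroup, words))
        pos = match.end()
    for word, _ in tagged[pos:]:
        chunks.append(("O", [word]))
    return chunks


if __name__ == "__main__":
    sentence = [
        ("The", "DET"), ("new", "ADJ"), ("company", "NOUN"),
        ("has", "AUX"), ("been", "AUX"), ("acquired", "VERB"),
        ("by", "PREP"), ("Bridgestone", "NOUN"),
    ]
    for label, words in preparse(sentence):
        print(label, " ".join(words))
```

Because both patterns are plain regular expressions over the tag string, the recognizer is a finite-state device; the Sentence Parser then sees four units for this sentence (a noun group, a verb group, the preposition, and a second noun group) instead of eight separate words.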



Jerry Hobbs 2004-02-24