Next: Evaluating the System Up: Introduction Previous: Introduction

The TACITUS System

The TACITUS text processing system has been under development at SRI International for the last six years. This system has been designed as a first step toward the realization of a system with very high completeness and accuracy in its ability to extract information from text. The general philosophy underlying the design of this system is that the system, to the maximum extent possible, should not discard any information that might be semantically or pragmatically relevant to a full, correct interpretation. The effect of this design philosophy on the system architecture is manifested in the following characteristics:

These basic design decisions do not by themselves distinguish TACITUS from a number of other natural-language processing systems. However, they are somewhat controversial given the intermediate goal of producing systems that are useful for existing applications. Criticism of the overall design with respect to this goal centers on the following observations:

Designers of application-oriented text processing systems have adopted a number of strategies for dealing with these problems. Such strategies involve de-emphasizing the role of syntactic analysis (Jacobs et al., 1991), producing partial parses with stochastic or heuristic parsers (de Marcken, 1990; Weischedel et al., 1991), or resorting to weaker syntactic processing methods such as conceptual or case-frame based parsing (e.g., Schank and Riesbeck, 1981) or template matching techniques (Jackson et al., 1991). A common feature of these weaker methods is that they ignore certain information that is present in the text and that a more comprehensive analysis could extract. The ignored information may be irrelevant to a particular application, or relevant in only an insignificant handful of cases. We therefore cannot argue that approaches to text processing based on weak or even nonexistent syntactic and semantic analysis are doomed to failure in all cases or unworthy of further investigation. However, it is not obvious how such methods can scale up to handle fine distinctions in attachment, scoping, and inference, although some recent attempts have been made in this direction (Cardie and Lehnert, 1991).

In the development of TACITUS, we have chosen a design philosophy that assumes that a complete and accurate analysis of the text is being undertaken. In this paper we discuss how issues of robustness are approached from this general design perspective. In particular, we demonstrate that

Our experience with TACITUS suggests that the system's capabilities can be extended to higher levels of completeness and accuracy through incremental modifications of its knowledge, lexicon, and grammar, while the robust processing techniques discussed in the following sections make the system usable for intermediate-term applications. We have evaluated the success of the various techniques discussed here, and conclude from this evaluation that TACITUS substantiates our claim that a text processing system based on principles of complete syntactic, semantic, and pragmatic analysis need not be too brittle or computationally expensive for practical applications.


Jerry Hobbs 2004-02-24