When a sentence is parsed and given a semantic interpretation, the relationship of this interpretation to both the information previously expressed in the text and the interpreter's general knowledge must be established. Establishing this relationship comes under the general heading of pragmatic interpretation. The particular problems that are solved during this step include
TACITUS interprets a sentence pragmatically by proving that its logical form follows from general knowledge and the preceding text, allowing a minimal set of assumptions to be made. In addition, it is assumed that the set of events, abstract entities, and physical objects mentioned in the text is to be consistently minimized. The best set of assumptions necessary to find such a proof can be regarded as an explanation of its truth, and constitutes the implicit information required to produce the interpretation (Hobbs et al., 1990). The minimization of objects and events leads to anaphora resolution by assuming that objects that share properties are identical, when it is consistent to do so.
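The identity assumption behind anaphora resolution can be illustrated with a small sketch. It is not the TACITUS implementation: the property sets, the `incompatible` table, and the function names `consistent` and `resolve` are all hypothetical, and overlap counting stands in for the cost-based preference that an abductive prover would use.

```python
def consistent(a, b, incompatible):
    """True if no property of a is declared incompatible with a property of b."""
    return not any(frozenset((p, q)) in incompatible for p in a for q in b)


def resolve(mention, candidates, incompatible):
    """Return the name of the candidate entity sharing the most properties
    with the mention, among those that can consistently be identified with
    it, or None if no candidate shares any property."""
    best_name, best_overlap = None, 0
    for name, props in candidates.items():
        overlap = len(mention & props)
        if overlap > best_overlap and consistent(mention, props, incompatible):
            best_name, best_overlap = name, overlap
    return best_name


# Hypothetical entities introduced by earlier text.
CANDIDATES = {
    "charge": {"bomb", "composed_of_dynamite"},
    "train": {"vehicle", "cargo_carrier"},
    "flores": {"person", "dead"},
}
# Hypothetical pairs of properties that one object cannot have together.
INCOMPATIBLE = {frozenset(("dead", "alive"))}

print(resolve({"bomb"}, CANDIDATES, INCOMPATIBLE))
print(resolve({"person", "alive"}, CANDIDATES, INCOMPATIBLE))
```

In this sketch a later mention of ``the bomb'' is identified with the earlier charge because they share a property and nothing rules the identification out. A mention with a conflicting property is left unresolved. When the `incompatible` table is too sparse, the same mechanism will merge objects that are in fact distinct.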
In the MUC-3 domain, explaining a text involves viewing it as an instance of one of a number of explanatory schemas representing terrorist incidents of various types (e.g. bombing, arson, assassination) or one of several event types that are similar to terrorist incidents, but explicitly excluded by the task requirements (e.g. an exchange of fire between military groups of opposing factions). This means that assumptions that fit into incident schemas are preferred to assumptions that do not, and the schema that ties together the most assumptions is the best explanation.
In this text interpretation task, the domain knowledge performs two primary functions:
Much domain knowledge may clearly be required to perform these functions successfully, but more knowledge is not always better. If axioms are incrementally added to the system to cover cases not accounted for in the existing domain theory, they may interact with the existing knowledge in such a way that the reasoning process becomes computationally intractable. The unhappy result would be a failure to find an interpretation even in cases in which the correct interpretation is entailed by the system's knowledge. In a domain as broad and diffuse as the terrorist domain, it is often impossible to guarantee by inspection that a domain theory is not subject to such combinatorial problems.
The goal of robustness in interpretation therefore requires one to address two problems: a system must permit a graceful degradation of performance in those cases in which knowledge is incomplete, and it must extract as much information as it can in the face of a possible combinatorial explosion.
The general approach of abductive text interpretation addresses the first problem through the notion of a ``best interpretation.'' The best explanation, given incomplete domain knowledge, can succeed at relating some propositions contained in the text to the explanatory schemas, but may not succeed for all propositions. The combinatorial problems are addressed through a particular search strategy for abductive reasoning described as incremental refinement of minimal information proofs.
The abductive proof procedure as employed by TACITUS (Stickel, 1988) will always be able to find some interpretation of the text. In the worst case--the absence of any commonsense knowledge that would be relevant to the interpretation of a sentence--the explanation offered would be found by assuming each of the literals to be proved. Such a proof is called a ``minimal information proof'' because no schema recognition or explication of implicit relationships takes place. However, the more knowledge the system has, the more implicit information can be recovered.
Because a minimal information proof is always available for any sentence of the text that is internally consistent, it provides a starting point for incremental refinement of explanations that can be obtained at next to no cost. TACITUS explores the space of abductive proofs by finding incrementally better explanations for each of the constituent literals. A search strategy is adopted that finds successive explanations, each of which is better than the minimal information proof. This process can be halted at any time in a state that will provide at least some intermediate results that are useful for subsequent interpretation and template filling.
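The anytime character of this strategy can be sketched as follows. This is a simplified illustration under stated assumptions, not the TACITUS prover: literals are plain strings, each axiom simply replaces one assumed literal with its antecedents, the numeric assumption costs are invented, and the search is a greedy hill climb bounded by a hypothetical `max_steps` budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proof:
    assumed: frozenset  # literals assumed rather than explained
    cost: float         # total cost of the assumptions


def proof_cost(assumed, assume_cost):
    return sum(assume_cost[lit] for lit in assumed)


def minimal_information_proof(literals, assume_cost):
    """Assume every literal outright: always available, no inference needed."""
    assumed = frozenset(literals)
    return Proof(assumed, proof_cost(assumed, assume_cost))


def refine(proof, axioms, assume_cost):
    """Yield proofs that explain one assumed literal by backchaining on one axiom.
    Antecedents shared by several literals are counted once (set union)."""
    for lit in sorted(proof.assumed):
        for antecedents in axioms.get(lit, []):
            assumed = (proof.assumed - {lit}) | frozenset(antecedents)
            yield Proof(assumed, proof_cost(assumed, assume_cost))


def interpret(literals, axioms, assume_cost, max_steps):
    """Start from the minimal information proof and keep the best proof found
    so far; stop when no refinement helps or the step budget runs out."""
    best = minimal_information_proof(literals, assume_cost)
    steps, improved = 0, True
    while improved and steps < max_steps:
        improved = False
        for candidate in refine(best, axioms, assume_cost):
            steps += 1
            if candidate.cost < best.cost:
                best, improved = candidate, True
                break
            if steps >= max_steps:
                break
    return best


# Hypothetical literals, axioms (consequent -> alternative antecedent lists),
# and assumption costs, chosen only for illustration.
LITERALS = ["derail(train)", "explode(charge)"]
AXIOMS = {
    "derail(train)": [["bombing(e)"]],
    "explode(charge)": [["bombing(e)"]],
}
COSTS = {"derail(train)": 10, "explode(charge)": 10, "bombing(e)": 8}

print(interpret(LITERALS, AXIOMS, COSTS, max_steps=100))
```

With a budget of zero steps, `interpret` returns the minimal information proof unchanged. With a larger budget, both literals end up explained by a single assumed bombing event, which mirrors the preference for the schema that ties together the most assumptions. Whatever the budget, the caller always receives a usable proof.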
Consider again Message 100 from the MUC-3 development corpus:
A cargo train running from Lima to Lorohia was derailed before dawn today after hitting a dynamite charge.
Inspector Eulogio Flores died in the explosion.
The police reported that the incident took place past midnight in the Carahuaichi-Jaurin area.
The correct interpretation of this text requires recovering certain implicit information that relies on commonsense knowledge. The compound nominal phrase ``dynamite charge'' must be interpreted as ``charge composed of dynamite.'' The interpretation requires knowing that dynamite is a substance, that substances can be related via compound nominal relations to objects composed of those substances, that things composed of dynamite are bombs, that hitting bombs causes them to explode, that exploding causes damage, that derailing is a type of damage, and that planting a bomb is a terrorist act. The system's commonsense knowledge base must be rich enough to derive each of these conclusions if it is to recognize the event described as a terrorist act, since not all derailings are the result of bombings. This example underscores the need for fairly extensive world knowledge in the comprehension of text. If the knowledge is missing, the correct interpretation cannot be found. (A few simple heuristics can capture some of the information, but at the expense of accuracy.)
However, if there is missing knowledge, all is not necessarily lost. If, for example, the knowledge that hitting a bomb causes it to explode were missing, the system could still hypothesize the relationship between the charge and the dynamite to reason that a bomb was placed. When processing the next sentence, the system may have trouble figuring out the time and place of Flores's death if it cannot associate the explosion with hitting the bomb. However, if the second sentence were ``The Shining Path claimed that their guerrillas had planted the bomb,'' the partial information would be sufficient to allow ``bomb'' to be resolved to dynamite charge, thereby connecting the event described in the first sentence with the event described in the second.
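The chain of inferences in the derailment example, and its graceful degradation when one axiom is missing, can be sketched with a small propositional rule set. This is a deductive simplification of what is really an abductive process. The predicate strings, the rule encoding, and in particular the rule that a bomb found in this setting was planted are assumptions made for illustration and are not taken from the TACITUS knowledge base.

```python
def forward_chain(facts, rules):
    """Apply (premises, conclusion) rules until nothing new can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and all(p in derived for p in premises):
                derived.add(conclusion)
                changed = True
    return derived


# Information read directly from the first sentence (hypothetical encoding).
FACTS = {
    "substance(dynamite)",
    "nn(dynamite, charge)",   # compound nominal "dynamite charge"
    "hit(train, charge)",
    "derail(train)",
}

# The rule under discussion: hitting a bomb causes it to explode.
HIT_RULE = (("bomb(charge)", "hit(train, charge)"), "explode(charge)")

RULES = [
    (("substance(dynamite)", "nn(dynamite, charge)"), "composed_of(charge, dynamite)"),
    (("composed_of(charge, dynamite)",), "bomb(charge)"),
    HIT_RULE,
    (("explode(charge)",), "damage(train)"),
    (("damage(train)", "derail(train)"), "derailing_explained"),
    (("bomb(charge)",), "planted(bomb)"),      # assumption made for this sketch
    (("planted(bomb)",), "terrorist_act"),
]

full = forward_chain(FACTS, RULES)
partial = forward_chain(FACTS, [r for r in RULES if r is not HIT_RULE])

print(sorted(full - FACTS))
print(sorted(partial - FACTS))
```

With the full rule set, the derailing is tied to the explosion and the event is recognized as a terrorist act. With the hitting rule removed, the explosion and the explanation of the derailing are lost, but the system still concludes that the charge is a bomb, which is enough for a later mention of ``the bomb'' to be resolved to it.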
It is difficult to evaluate the pragmatic interpretation component individually, since to a great extent its success depends on the adequacy of the syntactic analysis it operates on. However, in examining the first 20 messages of the MUC-3 test set in detail, we attempted to pinpoint the reason for each missing or incorrect entry in the required templates.
There were 269 such mistakes, due to problems in 41 sentences. Of these, 124 are attributable to pragmatic interpretation. We have classified their causes into a number of categories, and the results are as follows.
| Reason | Mistakes |
| --- | --- |
| Simple Axiom Missing | 49 |
| Combinatorics | 28 |
| Unconstrained Identity Assumptions | 25 |
| Complex Axioms or Theory Missing | 14 |
| Underconstrained Axiom | 8 |
An example of a missing simple axiom is that ``bishop'' is a profession. An example of a missing complex theory is one that assigns a default causality relationship to events that are simultaneous at the granularity reported in the text. An underconstrained axiom is one that allows, for example, ``damage to the economy'' to be taken as a terrorist incident. Unconstrained identity assumptions result from the knowledge base's inability to rule out identity of two different objects with similar properties, thus leading to incorrect anaphora resolution. ``Combinatorics'' simply means that the theorem-prover timed out, and the minimal-information proof strategy was invoked to obtain a partial interpretation.
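As a quick consistency check on the figures above, the category counts can be summed and expressed as shares of the pragmatics-related mistakes. The snippet uses only the numbers reported in the table and the text; the variable names are illustrative.

```python
# Counts copied from the table of mistakes attributable to pragmatic interpretation.
mistakes = {
    "Simple Axiom Missing": 49,
    "Combinatorics": 28,
    "Unconstrained Identity Assumptions": 25,
    "Complex Axioms or Theory Missing": 14,
    "Underconstrained Axiom": 8,
}
all_mistakes = 269  # total mistakes in the first 20 test messages

pragmatic_total = sum(mistakes.values())
shares = {reason: round(100 * n / pragmatic_total, 1) for reason, n in mistakes.items()}
pragmatic_share = round(100 * pragmatic_total / all_mistakes, 1)

print(pragmatic_total, pragmatic_share)
for reason, pct in shares.items():
    print(f"{reason}: {pct}%")
```

The categories sum to the 124 mistakes attributed to pragmatic interpretation, a little under half of the 269 mistakes overall, and missing simple axioms are the largest single category.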
It is difficult to evaluate the precise impact of the robustness strategies outlined here. The robustness is an inherent feature of the overall approach, and we did not have a non-robust control to test it against. However, the implementation of the minimal information proof search strategy eliminated virtually all of our complete failures due to lack of computational resources, and cut the error rate attributable to this cause roughly in half.