The interpretation of discourse is a very big problem. If it is to be accessible to inquiry, we must break it into smaller subproblems, ``carve nature at its joints,'' as the saying goes. But the saying provides a good illustration of a danger. Medicine, in carving nature at its joints, does not carve the body at its joints, with specialists in the lower right leg and the left index finger. It carves the body into coherent systems. The best way to carve the subject matter of a discipline is rarely obvious. In discourse theory one could, for example, concentrate on such ``subproblems'' as syntactic ambiguity, pronouns, or compound nominals, or one could confine one's inquiry to such domains as classroom discourse, telephone conversations, or children's stories. But these approaches would be the equivalent of concentrating on the left index finger. All the problems of discourse arise in each of these ``simplifications'', and a large-scale research program organized along such lines would result in massive duplications.
This book, and the framework informing it, is organized along quite different lines. Our goal is to present a theory of how world knowledge is brought to bear on the interpretation of discourse. Therefore, we first need a logical notation for expressing this knowledge. Chapter 2 presents the logical notation that will be used in this book. There are of course numerous, quite serious problems in representing natural language concepts in logic, including time, modality, adverbials, belief and other intensional contexts, and quantification and plurality. I do not pretend to have solved all of these problems. Rather, my goal has been to bypass them. This is done by means of an approach that might be called ``ontological promiscuity''. One assumes that anything that can be talked about exists in some Platonic universe. One consequence of this is that model theory, that is, the model-theoretic semantics of the logical notation, will do virtually no work for us. All of the particulars of the meanings of various concepts will have to be encoded explicitly in the knowledge base, essentially pushing the problems from Chapter 2 to Chapter 5. On the other hand, having a uniform and relatively simple representation for all content encoded in natural language greatly simplifies the task of describing the procedures that manipulate these representations to arrive at interpretations.
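To give a flavor of such a notation (an illustrative sketch in the spirit of ontological promiscuity, not necessarily the exact conventions adopted in Chapter 2): reifying eventualities lets a sentence like ``A boy runs'' be written flatly as

```latex
\exists e, x \; [\mathit{run}'(e, x) \wedge \mathit{boy}(x)]
```

where $e$ is the running eventuality. Modifiers then attach as further predications of $e$, so ``A boy runs fast'' simply adds the conjunct $\mathit{fast}(e)$, with no special machinery required for adverbials.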
Chapter 3 describes the ``Interpretation as Abduction'' framework that informs the rest of the book. The fundamental idea is that intelligent agents interpret their environment by finding the best explanation for the observables in it. Correspondingly, they interpret texts by finding the best explanation for the explicit content, the ``observables'', of the text. This picture is then subsumed under a picture in which intelligent agents interpret their environment, and texts, by finding the best explanation for why the environment, or the text, is coherent. A method of ``weighted abduction'' is described that has the right properties for finding a best explanation. It is indicated how this framework subsumes a broad range of problems in discourse interpretation; most of the remainder of the book expands on this. It is then shown how weighted abduction can be realized in a structured neural net, thereby bringing us closer to linking up intelligent behavior with neurophysiology. Finally there is a discussion of an incremental account of learning new axioms, and how this could be realized in the structured neural net model.
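The cost minimization at the heart of weighted abduction can be sketched in a few lines. The following is a toy propositional version; the rule format, weight values, and function below are my own illustrative simplifications, and the full method works over first-order literals and allows goals to be unified and merged. The idea: any goal may simply be assumed at its stated cost, or it may be explained via a rule, in which case a goal bearing cost C passes cost w_i * C to the rule's i-th antecedent; the best explanation is the cheapest proof.

```python
# Toy propositional sketch of weighted abduction (hypothetical rule
# format; real systems unify first-order literals and factor goals).
# A rule (head, [(ant1, w1), (ant2, w2), ...]) means: proving `head`
# while it bears cost C passes cost w_i * C to each antecedent ant_i.

def best_proof_cost(goal, cost, rules, facts, depth=5):
    """Return the minimum cost of explaining `goal` bearing `cost`."""
    if goal in facts:            # known facts cost nothing to prove
        return 0.0
    best = cost                  # option 1: simply assume the goal
    if depth == 0:               # bound the search depth
        return best
    for head, ants in rules:
        if head == goal:         # option 2: explain the goal via a rule
            total = sum(best_proof_cost(a, w * cost, rules, facts, depth - 1)
                        for a, w in ants)
            best = min(best, total)
    return best
```

For instance, with a rule saying a vehicle with an engine is a car, and ``vehicle'' already known, explaining ``car'' at cost 10.0 is cheaper via the rule (assuming only ``engine'' at 0.6 of the cost) than assuming ``car'' outright.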
If to interpret a sentence is to prove abductively its propositional content, there must be a way of mapping between the string of words and the logical representation, in the notation provided in Chapter 2, of the propositional content. Syntax provides this mapping. Chapter 4 treats syntax as an elaboration of the ways in which adjacency of segments of discourse can be interpreted as predicate-argument relations. The aim in Chapter 4 is to cover all the syntactic phenomena that occur in the target texts. The result is a rather substantial subset of English grammar, modelled loosely on Pollard and Sag's Head-Driven Phrase Structure Grammar. Moreover, because the target texts exhibit such ``performance'' phenomena as ungrammaticalities, scrambling, disfluencies, and co-constructions, these are dealt with as well. An account is given of how a competence grammar can be deployed in performance. The chapter closes with some remarks on modularity, and a plausible, incremental account of how syntax could have evolved.
People understand language so well because they know so much. Thus, the two major tasks in developing a theory of discourse interpretation are to specify the procedures that use knowledge to interpret discourse and to encode a sizable chunk of that knowledge. Much of the rest of the book is about the first of these tasks. Chapter 5 concerns the second. In Chapter 5 an effort is made to specify all the knowledge that is required in the understanding of all the target texts. However, the aim has been to do this by means of a systematic methodology that shows the way for extending such a knowledge base well beyond what is presented here. The knowledge is encoded at as general a level as possible, and in fact the diversity of the target texts is intended to force just this. The chapter taps into but elaborates significantly on a long tradition in lexical semantics. The most important domains that are axiomatized are abstract domains that underlie virtually every text. These include granularity, systems and the figure-ground relation, scales, change of state, space and time, causality, mental models, and goal-directed behavior. The more specific, concrete domains are built on top of these general, abstract domains, and their role in this book is primarily illustrative of how specific domains should be axiomatized.
The amount of world knowledge required for interpreting discourse is of course immense. Some take this as an argument that discourse theory is a hopeless enterprise. I do not agree. A basic premise of this work is that one can separate the knowledge that is represented and the processes that use that knowledge, and study each in isolation. To investigate the latter we do not need to know all the knowledge that is to be represented; we only need to see a representative, moderately large sampling, so that we have good intuitions as to how the knowledge would be represented and we begin to see the problems that arise in scaling up. Nor am I hopeless about the prospects of encoding vast amounts of world knowledge. Unabridged dictionaries and large encyclopedias get written, and the task of building the required knowledge base is probably not larger in scale than these efforts. My belief is that the first 10,000 axioms have to be developed with care. They will provide enough models that the next 100,000 axioms can be developed much more easily. We will then be in a position perhaps to learn the next 1,000,000 axioms automatically. My aim in this chapter is to make a serious start on that first 10,000 axioms.
Every text presents a rich set of discourse problems, as we saw in Section 1.1. The argument of Chapter 3 is that the solutions to these problems simply fall out of the process of finding the explanation for the text, both its occurrence and its content. Chapters 6, 7, and 8 are explorations of this thesis. A broad range of discourse problems is examined in detail, and it is shown how solutions to many of the difficulties they raise are simply subsumed under the general process of abduction, or how they place constraints on its operation.
What are the discourse problems that can be posed by a text? We may divide this question into three parts--the problems that arise within single sentences (whether or not they can be solved within this narrow perspective), problems that arise when a sentence is embedded within a larger discourse, and problems that arise when the discourse has to be related to the surrounding environment. Chapter 6 examines the first class of problems. The basic building blocks of sentences are predications, by which is meant a predicate applied to one or more arguments. This implies three problems. First, what does the argument refer to? This is the coreference problem, and other problems, such as many syntactic ambiguities, can be seen as variants of it. Second, what is the predicate that is being conveyed? This includes resolving word sense ambiguities. But more generally, texts give us rather general explicit information, and we need to determine more specific interpretations. Extreme cases of this can be seen in phenomena such as compound nominals and denominal verbs, where the predicate is only implicit and must be ``vivified''. Third, in what way are the predicate and its arguments congruent? At the simplest level, this amounts to the checking of selectional constraints. But the issue also arises of how to interpret the predication when there is no apparent congruence. There are two deformations one can make to force an interpretation. One can assume that the argument refers not to the explicit referent but to something functionally related to the explicit referent. This is metonymy, and the process of interpreting metonymy is known as ``coercion''. Or one can assume that the predicate does not quite mean what it literally means, that is, that certain inferences normally associated with the predicate cannot be drawn in this instance. An important example of this is metaphor. These two modes of interpretation and constraints on them are examined in Chapter 6 as well.
The second class of discourse problems concerns the relation of a sentence or larger segment of text to the rest of the text of which it is a part. This is the problem of ``local coherence''. In Chapter 7 this problem is addressed from very much the perspective of Chapter 4 on syntax--where two segments of text are adjacent, what are the possible interpretations of this adjacency? The argument of Chapter 7 is that overwhelmingly adjacency is interpreted as relations provided at the very foundation of the knowledge base in Chapter 5--the figure-ground relation, change of state, causality, and the similarity that allows us to build sets and other systems out of individual entities. Definitions of these coherence relations are given in terms provided by the knowledge base of Chapter 5. These definitions characterize what it is to recognize the relations and thereby recognize the coherence of the discourse. Examples are given of each of the coherence relations. It is shown that recognizing coherence in this way frequently leads to the solution of coreference and other discourse problems as a by-product. It is shown how this notion of coherence allows us to make very precise sense out of some of the classical and elusive concepts of discourse analysis, including ``topic'', ``focus'', ``genre'', and ``story grammar''. This account of coherence in discourse is compared with other accounts.
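To suggest the flavor of such definitions (a schematic of my own, not the book's actual axioms): a causal coherence relation such as Explanation, holding between adjacent segments $S_1$ and $S_2$ that assert eventualities $e_1$ and $e_2$ respectively, might be defined as

```latex
\mathit{Explanation}(S_1, S_2) \equiv \mathit{cause}(e_2, e_1)
```

Recognizing the relation then amounts to abductively proving $\mathit{cause}(e_2, e_1)$ from the knowledge base, and the unifications made in the course of that proof are what resolve pronouns and other underspecified references as a by-product.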
Chapter 8 moves the perspective out to the environment in which the discourse occurs and investigates how the occurrence of a discourse is to be explained as a part of ongoing events in the world. This is the problem of ``global coherence'' and constitutes the third set of problems that a discourse raises. Normally utterances are taken to be intentional actions in the service of a larger plan or plans the participants are engaged in. In this chapter the relation between a discourse and the plans of the participants is investigated.
Most of the book up to this point will have addressed the issue of how to make the correct interpretation of a discourse possible. But in a rich knowledge base, there will generally be many possible interpretations. Chapter 9 begins to discuss how a single ``best'' interpretation can be chosen for the sentence, given the various possible interpretations the theory licenses. There are two parts to this investigation--a theoretical examination and an attempt to draw lessons from practice. In the theoretical part, a framework is developed for describing optimal communication in terms of the probabilities of the predications being conveyed and the utilities of conveying them. It is shown how this unpacks at a finer grain into the scheme of weighted abduction, with a particular regime of assigning and altering weights. Then it is shown how the structured neural net representation of the weighted abduction scheme provides a biologically plausible approximation to the theoretically motivated model at the symbolic level. In the second part of Chapter 9 there is an examination of the problems that arise in interpreting the target texts and other discourse with respect to the knowledge base developed in Chapter 5.
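Schematically (this is my own gloss on the shape of such a framework, not the book's exact formulation): if an interpretation $I$ is a set of predications $p$, each with a probability $P(p)$ of being what the speaker intended and a utility $U(p)$ to the interpreter of recovering it, the optimal interpretation is

```latex
I^* = \arg\max_I \sum_{p \in I} P(p)\, U(p)
```

The assumption costs of weighted abduction can then be read as approximations of $-\log P(p)$, so that minimizing total proof cost approximates maximizing the probability of the interpretation.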
Chapter 10 consists of detailed analyses of the target texts, to illustrate how the theory developed in this book plays out in specific instances. The complete interpretations are shown and it is described how the solutions to the various discourse problems posed by these texts emerge from the analyses.
Finally, in Chapter 11, there is a discussion of the role of a formal theory of discourse in other theoretical enterprises that are concerned with discourse, including sociology, microsociology, ethnography, psychology, and literary criticism. The problems of validating hypotheses about textual interpretations and about shared knowledge are discussed here.