
Introduction

The research described here is a first effort to develop a neat formal account of the use of the determiner ``the'' in terms of the framework of ``Interpretation as Abduction'' (IA) (Hobbs et al., 1993). This framework assumes a style of representation that has been called ``ontological promiscuity'' (Hobbs, 1985), in which events and properties are reified, and sets and typical elements of sets are first-class individuals. Every morpheme is viewed as conveying a proposition that can be represented in first-order logic, and the logical form of a sentence is a flat conjunction of simple propositions, with roughly one proposition per morpheme.

The principal claim of the paper is that the word ``the'' conveys a relation between an entity referred to by the noun phrase and a description of the entity provided by the noun phrase. The information conveyed by ``the'' is that the entity is mutually identifiable in context by virtue of the property expressed by that description.

In the IA approach the interpretation of a text is the least-cost proof of its logical form, allowing assumptions at a cost for propositions that can't be proved. In choosing the least-cost proof we want to favor proofs that use axioms that are currently salient, are shorter, maximize redundancy, minimize assumptions, and use the most recent propositions in the previous text or text structure.

For example, consider the text

John bought a new car. The engine is already broken.

The existence of a car is assumed in interpreting the first sentence. A part of the logical form of the second sentence is the proposition that there is an engine of something. Suppose in our knowledge base we have the fact that cars have engines.

One interpretation is obtained by simply assuming this proposition, that is, that there is an engine of something. Another interpretation is obtained by backchaining on the axiom that cars have engines and assuming that the engine is the engine of some car. This is more expensive because it is longer. A third interpretation is obtained by using the fact that there is a car, established by the first sentence, together with the axiom to prove the existence of the engine. That is, the engine mentioned in the second sentence is the engine of the car mentioned in the first sentence. This proof is slightly longer, but it involves no assumptions, so it is the least-cost proof.
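The comparison of the three interpretations can be made concrete with a toy cost model. The following is only a minimal sketch: the constants `ASSUMPTION_COST` and `STEP_COST`, the `Proof` class, and the predicate strings are all hypothetical choices made for illustration, and they are not the cost scheme of Hobbs et al. (1993). The sketch charges a large cost for each assumed proposition and a small cost for each inference step, so that a longer proof with no assumptions can still be cheapest.

```python
from dataclasses import dataclass

# Hypothetical cost constants, chosen only for illustration.
ASSUMPTION_COST = 10.0  # cost of assuming a proposition that cannot be proved
STEP_COST = 1.0         # cost of each inference step (longer proofs cost more)


@dataclass(frozen=True)
class Proof:
    """A candidate interpretation of 'The engine is already broken.'"""
    name: str
    assumptions: tuple  # propositions assumed without proof
    steps: tuple        # inference steps used in the proof

    def cost(self) -> float:
        return (ASSUMPTION_COST * len(self.assumptions)
                + STEP_COST * len(self.steps))


candidates = [
    # 1. Simply assume that there is an engine of something.
    Proof("assume_engine", ("engine(e, x)",), ()),
    # 2. Backchain on "cars have engines" and assume some car.
    Proof("assume_some_car", ("car(x)",), ("car(x) -> engine(e, x)",)),
    # 3. Use the car already introduced by the first sentence.
    Proof("use_known_car", (),
          ("car(x) -> engine(e, x)", "identify x with the car of sentence 1")),
]

best = min(candidates, key=Proof.cost)
costs = {p.name: p.cost() for p in candidates}
```

Under these assumed constants the second interpretation costs more than the first because it is longer, and the third is cheapest because it needs no assumptions, which matches the ordering described in the text.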

Now consider the text

John bought a new car. I saw the red Honda yesterday.

Here the logical form of the second sentence includes the proposition that there is a red Honda. Suppose we have an axiom that says that cars manufactured by Honda Corporation are Hondas.


Then we can find a partial proof of the existence of a red Honda from the car mentioned in the first sentence. What is lacking in that proof is that the car is red and that it was manufactured by the Honda Corporation. But we can assume these two propositions and still have the least-cost proof of the existence of the red Honda, so we do. These assumptions, which the hearer makes in order to see the text as coherent, are implicatures. They are new information.

It needs to be emphasized that we are seeking the best interpretation of the whole text, not just the definite noun phrases. In the text

Go down Washington Street three blocks.
Turn left.
My house is the third one on the right across the street from the drugstore.

``the street'' does not refer to Washington Street, but to the street you turned left onto in the second sentence. This is because we have to prove not only the existence of a street, but also the existence of relations between the events and properties described in the successive sentences.

What is missing in the above analyses is the information conveyed by the definite determiner. The word ``the'' in the first example conveys the information that the engine can be uniquely mutually identified in context by virtue of its description as an engine. In the second example the car can be uniquely mutually identified in context by virtue of its description as a red Honda.

To explicate a notion of ``mutual identifiability'' we need to spell out a core theory of mutual belief. The key features of such a theory would be the following:

  1. If a set of agents mutually believe a proposition, then the individual agents believe it.
  2. If a set of agents mutually believe a proposition, then they mutually believe that they mutually believe it.
  3. Agents can do logic inside mutual belief.
  4. An agent's world knowledge is tagged by what groups of agents mutually believe it.
  5. Copresence implies mutual belief in what is co-perceived, so previous discourse is mutually believed.
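The first three features can be written out as axiom schemas. The notation below is an assumption made for illustration, not taken from the paper: $mb(G, P)$ stands for ``the group $G$ mutually believes $P$'', $believe(a, P)$ for ``agent $a$ believes $P$'', and $member(a, G)$ for group membership. The third line gives only one instance of doing logic inside mutual belief, namely closure under modus ponens.

```latex
\begin{align*}
& mb(G, P) \wedge member(a, G) \supset believe(a, P) \\
& mb(G, P) \supset mb(G, mb(G, P)) \\
& mb(G, P) \wedge mb(G, P \supset Q) \supset mb(G, Q)
\end{align*}
```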

An agent identifies an entity if the agent knows a property that is true of that entity and of nothing else. Further constraints are generally required on the property in various contexts. An entity is identifiable by an agent by virtue of a property if the agent's knowing that the property holds of the entity causes the agent to identify the entity. The simplest case is where the describing property and the identifying property are the same; this is the case of mutually known entities. An entity is mutually identifiable by a group of agents by virtue of a property if it is mutually believed by the agents in the group that if any of them knows that the property holds of the entity, that will cause the agent to identify the entity. To repeat what was said above, the word ``the'' conveys that the entity referred to by the noun phrase is mutually identifiable by virtue of the description provided by the noun phrase.

One way of being identifiable via a description is by being the unique entity of that description. Examples of this include known unique entities (``the world''), entities with a functional relation with another entity, either due to the function (``the top of the table'') or due to the entity (``the engine of the car''), superlatives (``the tallest man in the room''), or sets described by plural noun phrases (``the men in the room'').
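The uniqueness case can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the `context` dictionary, the entity names, and the function `unique_referent` are hypothetical, and a description is simplified to a set of property names that must all hold of the entity. It does not model mutual belief or the causal notion of identifiability, only the test that exactly one entity fits the description.

```python
from typing import Optional

# Hypothetical context: each entity is mapped to the set of properties
# that are taken to hold of it in the current discourse.
context = {
    "c1": {"car", "new"},
    "e1": {"engine"},
    "m1": {"man"},
    "m2": {"man"},
}


def unique_referent(description: set, context: dict) -> Optional[str]:
    """Return the one entity satisfying every property in the description,
    or None if no entity or more than one entity satisfies it."""
    matches = [entity for entity, props in context.items()
               if description <= props]
    return matches[0] if len(matches) == 1 else None


engine = unique_referent({"engine"}, context)  # exactly one engine in context
man = unique_referent({"man"}, context)        # two men, so not unique
```

In this sketch ``the engine'' picks out a single entity, while the singular ``the man'' fails because two entities fit the description.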

More common are cases where the hearer will be able to identify the entity uniquely in the natural course of understanding the discourse. The use of the definite determiner here is an expression of confidence in the hearer. Speakers are always monitoring the hearer's understanding via some folk theory of discourse understanding. The word ``the'' conveys a predicate in that theory.


Jerry Hobbs 2003-08-28