Investigate the structure of multisentence discourse, in both monologues and dialogues, and build text planning (and sentence planning) systems that produce coherent multisentence texts.
The internal structure of discourse and the computational planning and generation of coherent multisentential paragraphs have been topics of investigation at USC/ISI since the early 1980s. A theory of the interclausal relationships that govern discourse structure, called Rhetorical Structure Theory (RST), was developed in [Mann and Thompson 88] after extensive analysis of hundreds of texts of various genres. The analysis concluded that English text is coherent by virtue of so-called rhetorical relations that hold between clauses and blocks of clauses, and identified about 25 basic relations for English. These relations, such as Sequence, Purpose, and Elaboration, are usually identified in English by key words or phrases (such as "then", "in order to", and "e.g.", respectively).
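As a rough illustration of how key words can signal rhetorical relations, the following Python sketch looks up cue phrases in a sentence. Only the three cue/relation pairings come from the text above; the table name, the function name, and the simple matching strategy are assumptions made for illustration, not part of RST or of any USC/ISI system.

```python
import re

# Illustrative cue-phrase table. Only these three pairings are taken from
# the text; a realistic table would be far larger and more ambiguous.
CUE_TO_RELATION = {
    "then": "Sequence",
    "in order to": "Purpose",
    "e.g.": "Elaboration",
}


def find_relations(text):
    """Return (cue, relation) pairs for each cue phrase found in text.

    Hypothetical helper: matches whole cue phrases only, so that "then"
    does not fire inside a word such as "strengthen".
    """
    lowered = text.lower()
    hits = []
    for cue, relation in CUE_TO_RELATION.items():
        # Lookarounds require that the cue is not glued to a word character.
        pattern = r"(?<!\w)" + re.escape(cue) + r"(?!\w)"
        if re.search(pattern, lowered):
            hits.append((cue, relation))
    return hits


if __name__ == "__main__":
    print(find_relations("Open the file, then read it in order to parse it."))
```

A surface lookup of this kind can only suggest candidate relations, since the text says the relations are usually, not always, marked by such phrases.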
In order to plan multisentence paragraphs by computer, one requires both a sound theory of text organization and an algorithm that can make efficient use of it. The theory is provided by RST; the algorithm by an adaptation of the top-down hierarchical expansion planning system NOAH developed in AI in the 1970s. A text structure planner was developed at USC/ISI to plan coherent paragraphs that achieve communicative goals of affecting the hearer's knowledge in some way [Hovy 88]. The planner operated in conjunction with some application program (such as a database access system or expert system) and employed the Penman generator to generate individual sentences. Using operationalized RST relations and other text plans, the planner constructed a tree that embodied the paragraph structure, in which nonterminal nodes were RST relations and terminal nodes contained the material to be communicated. This text planning process has been extended by several other projects at USC/ISI and elsewhere. In a separate project, members of the EES/EXPECT project at USC/ISI built the EES text planner along the same lines as the initial Penman text structurer, incorporating an expanded text plan library using a notation oriented toward intentionality [Moore 89]. This planner's text plan contains the intentional, attentional, and rhetorical structures of the explanations it generates for EES expert systems.
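The tree-building idea described above, with relations at the nonterminal nodes and content at the leaves, can be sketched in a few lines of Python. This is a much-simplified recursive expansion under stated assumptions, not NOAH and not the USC/ISI planner: the goal names, the plan library, the content table, and the class and function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A paragraph-tree node: a relation name, or content if it is a leaf."""
    label: str
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children


# Hypothetical plan library: communicative goal -> (relation, subgoals).
PLAN_LIBRARY = {
    "explain-procedure": ("Sequence", ["describe-step-1", "describe-step-2"]),
    "describe-step-2": ("Purpose", ["state-action", "state-reason"]),
}

# Hypothetical content for goals that need no further expansion.
CONTENT = {
    "describe-step-1": "Open the valve.",
    "state-action": "Close the vent",
    "state-reason": "to build pressure.",
}


def plan(goal):
    """Expand a goal top-down until every branch ends in content."""
    if goal in PLAN_LIBRARY:
        relation, subgoals = PLAN_LIBRARY[goal]
        return Node(relation, [plan(sub) for sub in subgoals])
    return Node(CONTENT[goal])


def leaves(node):
    """Collect leaf content in left-to-right order."""
    if node.is_leaf():
        return [node.label]
    collected = []
    for child in node.children:
        collected.extend(leaves(child))
    return collected


if __name__ == "__main__":
    tree = plan("explain-procedure")
    print(tree.label, leaves(tree))
```

In the system described above the leaves would be handed to a sentence generator such as Penman; here they are plain strings so that the sketch stays self-contained.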
A later research effort studied the number of interclausal discourse structure relations. [Hovy and Maier 93] collected and taxonomized over 300 relations from a variety of sources into three taxonomies of 120 relations altogether. A second effort performed at USC/ISI involved the automated planning of certain types of text formatting. In [Hovy and Arens 91], the communicative semantics of certain text formatting devices (such as enumerated lists, itemizations, footnotes, appendices, etc.) is described in terms of RST relations, and the automated planning of formatted paragraphs of text is illustrated.
A general overview of the text planning work appears in [Hovy 93].
Current work focuses on two aspects: a better understanding of discourse structure and the construction of various modules of a sentence planner (including clause aggregation, sentence scoping, reference, and some aspects of lexical choice). The sentence planner is being tested in the Spangloss machine translation project.
This volume collects papers from contributors to the 1993 NATO workshop Burning Issues in Discourse. The papers were selected to illustrate not only the range of questions under study but also the methodologies employed by various world experts in Text Linguistics (Martin, Hajicová), Logic (Asher and Kamp), Computational Linguistics (Hobbs, Dahlgren, Passonneau and Litman), Phonology (Hirschberg), Linguistics (De Beaugrande), Sociology/Ethnomethodology (Schegloff), and Linguistics (Thompson and Ono).
This paper summarizes work over the past five years on the automated planning and generation of multisentence texts using discourse structure relations, placing it in the context of ongoing efforts by Computational Linguists and Linguists to understand the structure of discourse. Based on a series of studies by the author and others, the paper describes how the orientation of generation toward communicative intentions illuminates the central structural role played by intersegment discourse relations. It outlines several facets of discourse structure relations as they are required by and used in text planners -- their nature, number, and extension to associated tasks such as sentence planning and text formatting.
Rhetorical Structure Theory is a descriptive theory of a major aspect of the organization of natural text. It is a linguistically useful method for describing natural texts, characterizing their structure primarily in terms of relations that hold between parts of the text. This paper establishes a new definitional foundation for RST. Definitions are made more systematic and explicit, they introduce a new functional element, and incidentally reflect more experience in text analysis. Along with the definitions, the paper examines three claims and findings of RST: the predominance of nucleus/satellite structural patterns, the functional basis of hierarchy, and the communicative role of text structure.
This paper discusses the need for, and nature of, the multifunctionality of discourse markers, which signal in parallel several simultaneous structures that underlie coherent discourse. Arguing that any adequate description of discourse requires at least four distinct structural analyses -- semantic, interpersonal/goal-oriented, attentional/thematic, and rhetorical -- it addresses three questions: what are the individual functions of these different structures, how do they interact, and how are they expressed in the text? With respect to the last question, it is clear that when constructing the discourse, the speaker has to select, from the available structuring cues or markers, the one(s) that minimize the overall structural ambiguity for the hearer. When included in the rhetorical structure, and hence in the text, these cues or markers assist the hearer in decoding the speaker's message into the various parallel structures, resulting in effective communication. Three collections of cues (semantic, interpersonal, and rhetorical) are given, and the overloading of meaning for rhetorical cues is discussed.
Over the past ten years, researchers studying the structure of discourse have consistently had to face questions such as the following: Given that discourses consist of segments, how do the segments relate? What intersegment relations are there? How many are needed? A fair amount of controversy exists, ranging from the parsimonious position (that two basic relations suffice) to the profligate position (that an open-ended set of semantic/rhetorical relations is required). This paper outlines the arguments and then summarizes a survey of the conclusions of approximately 30 researchers -- from linguists to computational linguists to philosophers to Artificial Intelligence workers. It fuses and taxonomizes the more than 400 relations they have proposed into a hierarchy of approximately 70 increasingly semantic relations, and argues that though the taxonomy is open-ended in one dimension, it is bounded in the other and therefore does not give rise to anarchy. Some evidence is provided for the organization of the taxonomy, as well as a full listing of the sources.
Very few texts longer than a paragraph are written without appropriate formatting. To ensure readability, automated text generation programs must not only plan and generate their texts but be able to format them appropriately as well. We describe how work on the automated planning of multisentence text and on the display of information in a multimedia system led to the insight that text formatting devices such as footnotes, italicized regions, enumerations, etc., can be planned automatically by a text structure planning process. This is achieved by recognizing that each formatting device fulfills a specific communicative function in a text, and that such functions can be defined in terms of the text structure relations used as plans in a text planning system. An example is presented in which a text is planned from a semantic representation to a final form that includes English sentences and LaTeX formatting commands, intermingled as appropriate.
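To make the idea of planning formatting from text structure relations concrete, here is a minimal Python sketch that renders a relation and its parts as text with LaTeX commands. The specific pairings used (Sequence rendered as an enumerated list, Background rendered as a footnote) are assumptions chosen for illustration; the abstract does not state which relation drives which device, and the function name is hypothetical.

```python
def render(relation, parts):
    """Render one relation node as text with LaTeX formatting commands.

    The relation-to-device pairings below are illustrative assumptions,
    not the mapping used in the paper.
    """
    if relation == "Sequence":
        # Assumed pairing: ordered steps become an enumerated list.
        items = "\n".join(f"  \\item {part}" for part in parts)
        return "\\begin{enumerate}\n" + items + "\n\\end{enumerate}"
    if relation == "Background":
        # Assumed pairing: supporting material becomes a footnote
        # attached to the main statement.
        nucleus, satellite = parts
        return f"{nucleus}\\footnote{{{satellite}}}"
    # Any other relation falls back to plain running text.
    return " ".join(parts)


if __name__ == "__main__":
    print(render("Sequence", ["Open the valve.", "Close the vent."]))
    print(render("Background", ["RST has about 25 relations.", "See the cited analysis."]))
```

Keeping the decision keyed on the relation rather than on surface text reflects the abstract's claim that each formatting device fulfills a communicative function definable in terms of text structure relations.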
We describe in this paper a new text planner that has been designed to address several problems we had encountered in previous systems. Motivating factors include a clearer and more explicit separation of the declarative and procedural knowledge used in a text generation system as well as the identification of the distinct types of knowledge necessary to generate coherent discourse, including communicative goals, text types, schemas, discourse structure relations, and theme development patterns. This knowledge is encoded as separate resources and integrated under a flexible planning process that draws from appropriate resources whatever knowledge is needed to construct a discourse structure. We describe the resources and the planning process and illustrate the ideas with an example.