Practical experience with natural language generation (NLG) systems has repeatedly shown that it is difficult to link a sentence generator to a host system whose primary task is not generation, such as an expert system or a database. This is because the representations best suited to NLG usually differ greatly from those most natural for such other tasks.
But there are a number of applications--the production of information brochures, certain types of business letters, and some types of monthly reports, for example--in which the desired form and content are known beforehand, and the reason for using a generator is to produce textual variations. In such cases, one can create all possible inputs a priori and simply select and combine them as output. Unfortunately, when the individual variations become small and the number of possible combinations becomes large, the text produced by mere selection tends to be choppy at best and incoherent at worst.
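The selection-only approach described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Variant` class, the `select` function, the condition labels, and the placeholder sentences are hypothetical and are not taken from any actual system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A pre-written text fragment plus the conditions under which it applies."""
    text: str
    conditions: frozenset  # hypothetical reader attributes that must all hold


def select(variants, profile):
    """Return, in authoring order, the text of every variant whose conditions
    are all satisfied by the reader profile."""
    return [v.text for v in variants if v.conditions <= profile]


# Placeholder fragments, invented for this sketch only.
VARIANTS = [
    Variant("This medication should be taken with food.",
            frozenset({"on_medication"})),
    Variant("This medication may cause drowsiness.",
            frozenset({"on_medication", "drives"})),
    Variant("Regular exercise is recommended.",
            frozenset({"sedentary"})),
]

profile = frozenset({"on_medication", "drives"})
print(" ".join(select(VARIANTS, profile)))
# prints: This medication should be taken with food. This medication may cause drowsiness.
```

Each selected sentence is acceptable on its own, but the concatenation repeats its subject and reads as a list of separate statements. With many small variants, that effect compounds.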
The solution is a paradigm that has been called 'generation by selection and repair'. We are using it as the basis of the HealthDoc project, whose goal is the production of health-education materials customized to the individual patient. HealthDoc, funded by the Government of Ontario, is a collaboration among the University of Waterloo, the University of Toronto, and USC/ISI. Each site is building a part of the HealthDoc system.
ISI's portion of the work is the construction of a so-called sentence planner, the engine responsible for performing the repair once the basic textual variations have been selected for output. Sentence planners are employed as a relatively new phase between the traditional phases of text planning (content determination and ordering) and sentence realization (grammatical work). The major modules of the current sentence planner are:
- fine-grain discourse structuring,
- multi-sentence packaging and structuring, including aggregation,
- internal sentence organization,
- lexical choice,
- anaphor and other reference choice.
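To make the idea of repair as a sequence of modules concrete, here is a minimal Python sketch, not the HealthDoc sentence planner itself. It assumes a hypothetical `Clause` representation and implements toy versions of only two of the listed modules (aggregation and reference choice); the function names, the `max_predicates` limit, the fixed pronoun "It", and the placeholder sentences are all invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clause:
    """Hypothetical clause representation: a subject and its verb phrases."""
    subject: str
    predicates: tuple


def aggregate(clauses, max_predicates=2):
    """Toy aggregation: merge adjacent clauses that share a subject, as long
    as the merged clause stays within max_predicates verb phrases."""
    out = []
    for c in clauses:
        last = out[-1] if out else None
        if (last is not None and last.subject == c.subject
                and len(last.predicates) + len(c.predicates) <= max_predicates):
            out[-1] = replace(last, predicates=last.predicates + c.predicates)
        else:
            out.append(c)
    return out


def pronominalize(clauses, pronoun="It"):
    """Toy reference choice: use a pronoun when a clause repeats the subject
    of the clause immediately before it."""
    out, previous_subject = [], None
    for c in clauses:
        out.append(replace(c, subject=pronoun)
                   if c.subject == previous_subject else c)
        previous_subject = c.subject  # compare against the original subject
    return out


# The remaining modules would be further stages in this list.
STAGES = [aggregate, pronominalize]


def plan(clauses):
    """Run every stage in order over the list of clauses."""
    for stage in STAGES:
        clauses = stage(clauses)
    return clauses


def realize(clauses):
    """Stand-in for sentence realization: join each clause into a sentence."""
    return " ".join(f"{c.subject} {' and '.join(c.predicates)}." for c in clauses)


CLAUSES = [
    Clause("This medication", ("should be taken with food",)),
    Clause("This medication", ("should be stored in a cool place",)),
    Clause("This medication", ("may cause drowsiness",)),
]

print(realize(plan(CLAUSES)))
# prints: This medication should be taken with food and should be stored in a cool place. It may cause drowsiness.
```

Keeping each module as a function from a clause list to a clause list means stages can be added, removed, or reordered without touching the others. A real sentence planner would operate on much richer representations than the flat subject and predicate pairs assumed here.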
Demonstration versions of HealthDoc's modules (the textual Authoring Tool, the Sentence Planner, the overall system, etc.) exist on the web. A patent application was filed by the University of Waterloo for the design of the Master Document, based on research done there. Continuation funding is being sought, and a spinoff company called Inkpot Inc. has been formed.
Eduard Hovy, senior project leader
Visitors and collaborators, mostly from Canada:
Prof. Chrysanne DiMarco (overall project leader; University of Waterloo) and Prof. Graeme Hirst (University of Toronto)