The JV-FASTUS system achieved an error rate per slot fill of 0.70 and a richness-normalized error of 0.82 on English joint ventures. The error rate was the third-best result reported, and was bettered only by systems that had significantly longer domain-specific development time than FASTUS. The richness-normalized error was the second best of all systems reported. This error rate corresponds to a recall of 34%, a precision of 56%, and an equally weighted F-metric of 42.67.
The JV-FASTUS system also achieved an error rate per slot fill of 0.70 and a richness-normalized error of 0.79 on Japanese joint ventures. This was also the third-best error rate of all sites reporting, and was bettered only by systems that had significantly longer development time. This error rate corresponds to a recall of 34%, a precision of 62%, and an equally weighted F-metric of 44.21.
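For readers who want to relate the recall and precision figures to the F-metric, the following is a minimal sketch of the standard F-measure formula, F = (b^2 + 1)PR / (b^2 P + R), with b = 1 for equal weighting. The function name and the percentage convention are our own choices for illustration, not part of the official scoring software. Applied to the rounded percentages quoted above, it gives values slightly below the reported 42.67 and 44.21, presumably because the official scores were computed from unrounded recall and precision.

```python
def f_measure(precision, recall, beta=1.0):
    """Return the F-measure for precision and recall given as percentages.

    beta = 1.0 weights precision and recall equally.
    (Hypothetical helper for illustration; not the official scorer.)
    """
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0:
        return 0.0  # no correct fills at all
    return (b2 + 1.0) * precision * recall / denominator


# Rounded figures quoted in the text (assumed inputs; the official
# scores were presumably computed from unrounded values).
english = f_measure(precision=56.0, recall=34.0)
japanese = f_measure(precision=62.0, recall=34.0)

print(f"English:  {english:.2f}")
print(f"Japanese: {japanese:.2f}")
```

Because the F-measure is a harmonic mean, it is pulled toward the lower of the two inputs, which is why both scores sit much closer to the 34% recall than to the precision figures.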
After the MUC-4 conference, SRI embarked on an effort to rationalize our MUC-4 system, refine its overall architecture, and add a user interface to facilitate the definition and maintenance of the finite-state transducers comprising the system. No domain-specific development was undertaken until the beginning of April, with the exception of a very skeletal system for English joint ventures, which was based on a corpus of about 10 articles extracted by hand from current issues of the Wall Street Journal and was used as the basis of a demonstration at the ARPA HLT meeting. The first end-to-end test of the English joint ventures system on 100 texts was conducted on April 30, 1993, and yielded an unimpressive error rate of 0.93 (F-measure 6.02).
At the same time, we ran the first end-to-end test of the Japanese version of JV-FASTUS. Although we had developed a version of FASTUS called MIMI for information extraction from Japanese spoken dialogues [ref?], and thus had some experience in processing Japanese with FASTUS, the actual base system for MIMI is really the same as the English system, since it operates on a Romaji encoding of speech, rather than on Kanji characters. [Megumi to put some results in here].
We succeeded in raising the system performance from this baseline to our reported results in three months of work. We feel that this experience confirms the adequacy of the tools provided by JV-FASTUS for the rapid development of information extraction systems in new domains. We stopped development of the system at approximately 3:00 PM on August 1. At that time, our improvement curve was still extremely steep: work during the morning of August 1 resulted in a 0.5 point improvement in F-measure. We had not even attempted to produce revenue objects, and our treatment of times and facilities was extremely sketchy. We feel that another week or two of development would have led to significantly improved results.