Workshop on Machine Translation Evaluation
in conjunction with NAACL-2001


Hands-On Evaluation

3 or 4 June, 2001
Pittsburgh, PA
United States


Evaluation of language tools, particularly tools that generate language, remains an interesting and general problem. Machine Translation (MT) is a prime example. Approaches to evaluating MT are even more plentiful than approaches to MT itself; the number of evaluations and the range of variants are confusing to anyone considering an evaluation. In an effort to systematize MT evaluation, the NSF-funded ISLE project has created a taxonomy of evaluation-related features and measures. Unfortunately, many prior evaluations do not adequately specify important aspects such as evaluation process complexity, cost, and variance of score.

In an effort to drive MT evaluation to the next level, this workshop will focus on exercising methods of acquiring such information for several important MT evaluation measures. The workshop thus embodies the challenge of Hands-On Evaluation, within the context of the framework being developed by the ISLE MT Evaluation effort. It follows a workshop on MT Evaluation held at the AMTA Conference in Cuernavaca, Mexico, in October 2000, and a subsequent workshop planned for April 2001 in Geneva.


The first part of the workshop will introduce the ISLE MT Evaluation effort, funded by NSF and the EU, which aims to create a general framework of characteristics in terms of which MT evaluations, past and future, can be described and classified. The framework, whose antecedents are the JEIDA and EAGLES reports, consists of taxonomies of increasingly specific features, with associated measures and pointers to systems. The discussion will review the current state of the classification effort as well as the MT evaluation history from which it was drawn.

The second, principal, part of the workshop will focus on real-world evaluation. To establish common ground for discussion, participants will be given specific evaluation exercises, defined by the taxonomy and recent MT evaluation trends. In addition, they will be given a set of texts generated by MT systems, along with human reference translations. During the workshop, they will be asked to perform the given evaluation exercises on the given data. This common framework will give insights into the evaluation process and into useful metrics for driving the development process. The results of the exercises will then be presented by the participants, synthesized into a uniform description of each evaluation, and added to the ISLE taxonomy, which has been made available on the web for future analysis in MT evaluation. The results of the workshop will also be incorporated into a publicly available resource, and the workbook from the workshop can be used by teachers of evaluation and MT.


Since this is a hands-on workshop, participants will be asked to submit an intent to participate. At that time, they will be able to download the relevant data for review. During the workshop, they will be given a series of exercises and split into teams to work through them. The result of the workshop will be at least one paper that addresses the following threads of investigation within the framework:


Since this is a hands-on workshop, no papers are being solicited. Participants will be expected to take part in the exercises and report their conclusions. They will additionally be encouraged to contribute to a summary paper of the workshop proceedings. The data will be sent to participants in advance of the workshop, with instructions on what to do and what to prepare. The amount of work required should not exceed 4 hours (much less than paper preparation).

To register an intent to participate, please send a paragraph outlining your interest in MT, experience with MT evaluation, knowledge of either Spanish or Arabic, and the following contact information to Flo Reeder (contact info below):

Participants will need to register for the workshop as part of their NAACL registration.


Intent to Participate: April 16, 2001
Release of Data: April 23, 2001
Workshop date: TBD


Florence Reeder
MITRE Corporation
1820 Dolley Madison Blvd.
McLean, VA 22102-3481
TEL: 703-883-7156
FAX: 703-883-1379

Eduard Hovy
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292-6695
TEL: 310-448-8731
FAX: 310-823-6714


ISLE Classification Web-Site:

AMTA-2000 Workshop Proceedings:

This site last modified 04 January 2001.