Metric: Description by the developer of the theory and method of translation used by the system -- Method: Provision by developer of white papers and supporting documentation -- Measurement: Confirmation of method by study of documentation
Metric: Minimum size of the training corpus -- Method: Specification by developer of minimum training corpus size -- Measurement: Yes or no: Size specification reported
Metric: Accessibility of training corpus or techniques -- Method: Specification by the developer of interface / tools for training corpus -- Measurement: Yes or no: Training corpus is accessible
Metric: Specification for training corpus preparation -- Method: Provision by developer of training corpus preparation tools / documentation -- Measurement: Yes or no: Training corpus preparation tools / documentation are available
It is sometimes assumed that statistical MT systems constitute a new type of translation model. In fact, they implement one of the above-mentioned models in a different way: the lexicons, transfer rules, etc. are learned statistically from large collections of data. There is no new 'statistical MT program'. IBM's CANDIDE system (Della Pietra et al.) and the EGYPT system (Knight et al.) are examples of direct replacement systems involving some word order reorganization.
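
To make concrete what learning a lexicon statistically from data can look like, here is a minimal sketch of an IBM Model 1 style expectation-maximization loop that estimates word translation probabilities from a parallel corpus. It is an illustration under stated assumptions, not the actual implementation of CANDIDE or EGYPT: the function names `train_lexicon` and `best_source`, the three-sentence toy corpus, and the omission of a NULL word are all choices made for this example.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (source words, target words).
CORPUS = [
    ("das Haus".split(), "the house".split()),
    ("das Buch".split(), "the book".split()),
    ("ein Buch".split(), "a book".split()),
]

def train_lexicon(pairs, iterations=10):
    """Estimate t(f | e) by EM, in the style of IBM Model 1 (no NULL word)."""
    src_vocab = {f for fs, _ in pairs for f in fs}
    # Start from a uniform distribution over source words.
    t = defaultdict(lambda: 1.0 / len(src_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of e
        for fs, es in pairs:
            for f in fs:
                # E-step: share one unit of probability for f among the
                # target words of the same sentence pair.
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t

def best_source(t, e):
    """Return the source word f with the highest t(f | e)."""
    candidates = {f: p for (f, e2), p in t.items() if e2 == e}
    return max(candidates, key=candidates.get)

lexicon = train_lexicon(CORPUS)
for word in ("the", "house", "book", "a"):
    print(word, "->", best_source(lexicon, word))
```

The resulting table plays the role of the bilingual lexicon in a direct replacement system; the translation model itself is unchanged, and only the way the lexicon is obtained differs.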