According to White (2000), operational evaluations generally address the question of whether an MT system will actually serve its purpose in the context of its operational use. The primary factors include the cost-benefit of bringing the system into the overall process (costs).
Linguistic resources and utilities (2.1.2/403)
Interoperability (2.2.1.4/192)
Reliability (2.2.2/600)
Maintainability (2.2.5/620)
Portability (2.2.6/622)
Cost (2.2.7/624)
A variety of issues are considered here, including software and hardware compatibility with the incumbent office automation system (interoperability). However, the more fundamental question for operational use is whether the MT system enhances the effectiveness of the downstream task, or whether the end-to-end process is better off without it.
As an example, consider cross-lingual information retrieval. Evaluation of MT embedded in a cross-lingual information processing environment takes into account the measures that are germane to the downstream task. So if we want to know whether an MT system helps information extraction, we compare the recall and precision (metrics germane to extraction) of the MT-plus-extraction configuration to those of an expert-translation-plus-extraction process, or to those of extraction without any translation at all. Note that we do not measure functionality characteristics of the MT system itself, such as fidelity and intelligibility, but rather the effect of the MT (good or bad) on the downstream task in terms of that task's metrics. To a large extent, then, operational evaluation lies outside the bounds of this classification, which is concerned only with the classification and evaluation of MT systems.
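The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the configuration names, the gold-standard set, and the extracted items are all hypothetical, and exact string matching is assumed as the criterion for a correct extraction. A real evaluation would use whatever matching rules the extraction task itself defines.

```python
from typing import Dict, Iterable, Tuple


def precision_recall(extracted: Iterable[str], gold: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) of extracted items against a gold standard.

    Assumption: an extracted item counts as correct only if it matches a
    gold item exactly.
    """
    extracted_set, gold_set = set(extracted), set(gold)
    true_positives = len(extracted_set & gold_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    return precision, recall


def compare_configurations(
    outputs: Dict[str, Iterable[str]], gold: Iterable[str]
) -> Dict[str, Tuple[float, float]]:
    """Score each end-to-end configuration with the downstream task's metrics."""
    gold_set = set(gold)
    return {name: precision_recall(items, gold_set) for name, items in outputs.items()}


# Hypothetical gold-standard facts that the extraction task should find.
gold = {"Acme Corp", "Jane Doe", "Geneva", "2000-05-12"}

# Hypothetical extraction results for the three configurations being compared.
outputs = {
    "mt_plus_extraction": {"Acme Corp", "Geneva", "Jane Do"},
    "expert_translation_plus_extraction": {"Acme Corp", "Jane Doe", "Geneva", "2000-05-12"},
    "extraction_without_translation": {"Geneva"},
}

scores = compare_configurations(outputs, gold)
for name, (p, r) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```

Nothing in this sketch scores the translation itself: only the extraction output is measured, so any difference between configurations reflects the effect of the MT on the downstream task.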