MT Models

Definition

Every MT system embodies some theory of language and of translation. Usually most of the theoretical assumptions are implicit, possibly not even known to the developer.

The simplest MT systems directly replace terms and phrases in the source language with target-language equivalents. Rudimentary word order changes are often performed as well. Example-based systems (EBMT) are one type in this class; they replace phrases or even whole paragraphs at a time.
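Direct replacement can be illustrated with a minimal sketch. The lexicon entries, the word-class sets, and the function names below are hypothetical and chosen only for illustration; a real system would use a far larger dictionary and more careful reordering rules.

```python
# Toy English -> Spanish lexicon (hypothetical entries, for illustration only).
LEXICON = {
    "the": "el",
    "cat": "gato",
    "black": "negro",
    "sleeps": "duerme",
}

# Minimal word-class information needed by the single reordering rule below.
ADJECTIVES = {"black"}
NOUNS = {"cat"}


def reorder(tokens):
    """Rudimentary word-order change: turn adjective-noun into noun-adjective."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i] in ADJECTIVES and out[i + 1] in NOUNS:
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


def direct_translate(sentence):
    """Reorder source tokens, then replace each one by its lexicon entry."""
    tokens = reorder(sentence.lower().split())
    # Words missing from the lexicon pass through unchanged.
    return " ".join(LEXICON.get(token, token) for token in tokens)


print(direct_translate("The black cat sleeps"))  # el gato negro duerme
```

No analysis of the sentence takes place here: the output is only as good as the lexicon and the handful of local reordering rules.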

More sophisticated MT systems try to improve syntactic (grammatical) quality by analyzing the source sentence into a syntax tree and then converting the tree into the form required by the target syntax (for example, moving the verb complex). At the cost of building grammars and parsers, such syntactic transfer systems produce higher-quality output.
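The tree conversion step might look roughly like the sketch below. It assumes the parser has already produced a tree (here built by hand as nested tuples), and it applies one hypothetical transfer rule that moves the verb to the end of its verb phrase, as a verb-final target language would require. Lexical replacement is left out so that only the structural change is visible.

```python
# A tree node is (label, children); a leaf is (label, word).
# The tree is hand-built here; a real system would obtain it from a parser.
TREE = ("S", [
    ("NP", [("PRON", "she")]),
    ("VP", [
        ("V", "reads"),
        ("NP", [("DET", "the"), ("N", "book")]),
    ]),
])


def transfer(node):
    """Apply the transfer rule recursively: inside each VP, move V to the end."""
    label, children = node
    if isinstance(children, str):
        return node  # leaf: nothing to restructure
    new_children = [transfer(child) for child in children]
    if label == "VP":
        verbs = [c for c in new_children if c[0] == "V"]
        others = [c for c in new_children if c[0] != "V"]
        new_children = others + verbs
    return (label, new_children)


def leaves(node):
    """Read the words off the tree from left to right."""
    label, children = node
    if isinstance(children, str):
        return [children]
    return [word for child in children for word in leaves(child)]


print(" ".join(leaves(transfer(TREE))))  # she the book reads
```

The rule refers only to category labels (VP, V), so it applies to any sentence the parser can analyze, which is where the gain over word-level replacement comes from.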

One level more complex, semantic transfer systems analyse the source text into some formalism that is intended to capture meaning, not just grammatical form. The formalisms used by shallow semantic systems are not fully language-independent and hence require some transformations into target form.

The most complex systems analyse the input into a language-neutral interlingual formalism, from which many target languages can be directly generated. No wide-coverage interlingua has yet been developed.

These levels of translation have been represented by the so-called MT triangle (Vauquois).

In general, the more sophisticated the internal processing, the higher the output quality, but the more domain-specific and brittle the system. Most modern working systems include a blend of syntactic and semantic transfer.

Metrics

References

Hutchins and Somers, 1992.

Arnold et al., 1994.

Notes

In practice, MT systems do not operate solely at one level of processing. In fact, falling back to a less complex strategy is often indicated when errors occur.
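One way to picture such a fallback is a chain of strategies tried from most to least complex. The sketch below is an assumption about how this could be organized, not a description of any particular system: the three strategy functions are stand-ins that only tag their input, and the failure conditions are invented for the example.

```python
class AnalysisError(Exception):
    """Raised by a strategy that cannot analyze its input."""


def semantic_transfer(sentence):
    # Stand-in: pretend no semantic analysis is ever available.
    raise AnalysisError("no semantic analysis available")


def syntactic_transfer(sentence):
    # Stand-in: pretend the parser fails on sentences longer than three words.
    if len(sentence.split()) > 3:
        raise AnalysisError("parse failed")
    return f"[syntactic] {sentence}"


def direct_replacement(sentence):
    # Stand-in: the simplest strategy always produces some output.
    return f"[direct] {sentence}"


# Ordered from most to least complex.
STRATEGIES = [semantic_transfer, syntactic_transfer, direct_replacement]


def translate_with_fallback(sentence):
    """Return the output of the first strategy that does not fail."""
    for strategy in STRATEGIES:
        try:
            return strategy(sentence)
        except AnalysisError:
            continue  # fall back to the next, less complex strategy
    raise AnalysisError("all strategies failed")


print(translate_with_fallback("a short one"))
print(translate_with_fallback("this one is rather long"))
```

With these stand-ins, the short sentence is handled by the syntactic strategy, while the longer one falls through to direct replacement.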

