Coverage of corpus-specific phenomena

Definition

Coverage refers to the ability of the system to deal satisfactorily with linguistic phenomena, both generally addressing known cross-language phenomena and specifically addressing phenomena in a corpus of interest.

Coverage of corpus-based problematic phenomena concerns the ability of the system to deal with the particular challenges presented by a corpus of interest.

Metrics

Assemble a representative corpus, submit it to the system, and observe which errors occur.

Given a test suite of representative phenomena specific to the corpus of interest, low-level and aggregate measurements like those described in Cross-language phenomena (2.2.1.1.2/502) can be used.
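As an illustration of the low-level and aggregate measurements mentioned above, the following minimal sketch computes a pass rate per phenomenon and an overall pass rate over a test suite. The function name, the (phenomenon, handled_ok) data layout, and the sample phenomena are assumptions made for this example; they are not part of the framework.

```python
from collections import defaultdict

def coverage_by_phenomenon(results):
    """Compute per-phenomenon and overall pass rates.

    `results` is an iterable of (phenomenon, handled_ok) pairs, one per
    test item. This layout is a hypothetical choice for the sketch.
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for phenomenon, handled_ok in results:
        totals[phenomenon] += 1
        if handled_ok:
            passed[phenomenon] += 1
    # Low-level measurement: fraction of items handled per phenomenon.
    per_phenomenon = {p: passed[p] / totals[p] for p in totals}
    # Aggregate measurement: fraction of all items handled.
    n_items = sum(totals.values())
    overall = sum(passed.values()) / n_items if n_items else 0.0
    return per_phenomenon, overall

# Invented sample judgments, for illustration only.
sample = [
    ("domain terminology", True),
    ("domain terminology", False),
    ("telegraphic style", True),
    ("telegraphic style", True),
]
per_phenomenon, overall = coverage_by_phenomenon(sample)
```

Reporting the per-phenomenon rates alongside the aggregate keeps a poorly handled phenomenon from being hidden by good results elsewhere in the suite.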

Subjective human scoring on a 10-point scale.
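One simple way to summarise such judgments is to average them across judges. The sketch below assumes the scale runs from 1 to 10 (the text only says "10-point") and uses a hypothetical helper name.

```python
from statistics import mean

def mean_score(scores, low=1, high=10):
    """Average a list of human scores after checking the scale bounds.

    The 1..10 range is an assumption; adjust `low` and `high` if the
    scale in use is, for example, 0..9.
    """
    for s in scores:
        if not low <= s <= high:
            raise ValueError(f"score {s} is outside the {low}-{high} scale")
    return mean(scores)

# Invented scores from three judges for one system output.
average = mean_score([7, 8, 6])
```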

References

Dorr 1990a.

Dorr 1990b.

Niessen, Och, Leusch, and Ney 2000.

