Corpora

Definition
The kinds and number of monolingual, comparable, or parallel corpora available to the system. Which category of corpus is relevant depends on the style of language modeling and the statistical techniques used in the system.
Metrics

Metric: Types of corpora incorporated into the system. -- Method: Report by the developer. -- Measurement: List of described types of corpora (monolingual, comparable, parallel).

Metric: Number of each type of corpora incorporated into the system. -- Method: Report by the developer. -- Measurement: List of described numbers of corpora, categorized by type.

Metric: Kinds of each type of corpora incorporated into the system. Beyond its type (monolingual, comparable, etc.), each corpus has a kind, which covers domain, genre, dates of collection, etc. -- Method: Report by the developer. -- Measurement: List of described kinds of corpora (domains, genres, dates of collection, etc.), categorized by type.
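
As an illustration only, the three measurements above could be derived from a single developer report along the following lines. This is a minimal sketch, not a prescribed format: the `Corpus` record, its field names, the `summarize` function, and the sample entries are all hypothetical and chosen purely for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Corpus:
    """One corpus as described by the developer (hypothetical record layout)."""
    name: str
    corpus_type: str  # "monolingual", "comparable", or "parallel"
    domain: str
    genre: str
    collected: str    # date or date range of collection, as reported


def summarize(corpora):
    """Return the three measurements: types, number per type, kinds per type."""
    # Measurement 1: list of described types of corpora.
    types = sorted({c.corpus_type for c in corpora})
    # Measurement 2: number of corpora, categorized by type.
    counts = dict(Counter(c.corpus_type for c in corpora))
    # Measurement 3: kinds (domain, genre, collection dates), categorized by type.
    kinds = defaultdict(list)
    for c in corpora:
        kinds[c.corpus_type].append((c.domain, c.genre, c.collected))
    return types, counts, dict(kinds)


# Invented sample report, for illustration only.
report = [
    Corpus("news-en", "monolingual", "news", "newswire", "2001"),
    Corpus("manuals-fr", "monolingual", "technical", "user manuals", "1999-2002"),
    Corpus("press-en-de", "comparable", "news", "press releases", "2000"),
    Corpus("law-en-fr", "parallel", "legal", "legislation", "1996-2011"),
]

types, counts, kinds = summarize(report)
print(types)
print(counts)
for corpus_type, described in kinds.items():
    print(corpus_type, described)
```

Keeping one record per corpus means all three lists come from the same source, so the reported counts and kinds cannot drift apart.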

References
Notes

Note that the ease of update is covered in section 2.2.5.2.

