Publications
Unsupervised data processing for classifier-based speech translator
Abstract
Concept classification has been used as a translation method and has shown notable benefits within the suite of speech-to-speech translation applications. However, the main bottleneck in achieving acceptable performance with such classifiers is the cumbersome task of annotating large amounts of training data. Any attempt to develop a method to assist in, or to completely automate, data annotation needs a distance measure to compare sentences based on the concept they convey. Here, we introduce a new method of sentence comparison that is motivated by the translation point of view. In this method, the imperfect translations produced by a phrase-based statistical machine translation system are used to compare the concepts of the source sentences. Three clustering methods are adapted to support the concept-based distance. These methods are applied to prepare groups of paraphrases and use them …
- Date: 2013
- Authors: Emil Ettelaie, Panayiotis G Georgiou, Shrikanth S Narayanan
- Journal: Computer Speech & Language
- Volume: 27
- Issue: 2
- Pages: 438-454
- Publisher: Academic Press