ReWrite
A Probabilistic Approach to Rewriting for Machine Translation and Abstracting

Publications

Galley, M., Hopkins, M., Knight, K. and Marcu, D. 2004. What's in a translation rule? Proceedings of the Human Language Technology conference - North American chapter of the Association for Computational Linguistics annual meeting (NAACL-HLT, 2004), Boston, MA. PDF

Graehl, J. and Knight, K. 2004. Training Tree Transducers. Proceedings of the Human Language Technology conference - North American chapter of the Association for Computational Linguistics annual meeting (NAACL-HLT, 2004), Boston, MA. PostScript

Al-Onaizan, Y., Germann, U., Hermjakob, U., Knight, K., Koehn, P., Marcu, D. and Yamada, K. 2003. Translation with Scarce Bilingual Resources. Machine Translation, 2003.

Charniak, E., Knight, K. and Yamada, K. 2003. Syntax-based Language Models for Machine Translation. Proceedings of MT Summit IX 2003. New Orleans. PDF

Germann, U. 2003. Greedy Decoding for Statistical Machine Translation in Almost Linear Time. Proceedings of HLT-NAACL 2003. Edmonton, AB, Canada. PDF

We present improvements to a greedy decoding algorithm for statistical machine translation that reduce its time complexity from at least cubic (O(n^6) when applied naively) to practically linear time without sacrificing translation quality. We achieve this by integrating hypothesis evaluation into hypothesis creation, tiling improvements over the translation hypothesis at the end of each search iteration, and by imposing restrictions on the amount of word reordering during decoding.

Germann, U., Jahr, M., Knight, K., Marcu, D. and Yamada, K. 2003. Fast Decoding and Optimal Decoding for Machine Translation. Artificial Intelligence, 2003.

Kondrak, G., Marcu, D. and Knight, K. 2003. Cognates Can Improve Statistical Translation Models. HLT-NAACL 2003, companion volume, pp. 46-48. Edmonton, AB, Canada. PDF

Oard, D.W., Doermann, D., Dorr, B., He, D., Resnik, P., Weinberg, A., Byrne, W., Khudanpur, S., Yarowsky, D., Leuski, A., Koehn, P., and Knight, K. 2003. Desperately Seeking Cebuano. Proceedings of HLT-NAACL 2003. Edmonton, AB, Canada.

Al-Onaizan, Y. and Knight, K. 2002. Named Entity Translation: Extended Abstract. Proceedings of HLT-02. San Diego. PostScript

Al-Onaizan, Y. and Knight, K. 2002. Translating Named Entities Using Monolingual and Bilingual Resources. Proceedings of ACL-02. Philadelphia. PostScript

Named entity phrases are some of the most difficult phrases to translate because new phrases can appear from nowhere, and because many are domain specific, not to be found in bilingual dictionaries. We present a novel algorithm for translating named entity phrases using easily obtainable monolingual and bilingual resources. We report on the application and evaluation of this algorithm in translating Arabic named entities to English. We also compare our results with the results obtained from human translations and a commercial system for the same task.

Al-Onaizan, Y. and Knight, K. 2002. Machine Transliteration of Names in Arabic Text. Proceedings of ACL Workshop on Computational Approaches to Semitic Languages. Philadelphia. PostScript

We present a transliteration algorithm based on sound and spelling mappings using finite state machines. The transliteration models can be trained on relatively small lists of names. We introduce a new spelling-based model that is much more accurate than state-of-the-art phonetic-based models and can be trained on easier-to-obtain training data. We apply our transliteration algorithm to the transliteration of names from Arabic into English. We report on the accuracy of our algorithm based on an exact-matching criterion and on subjective human evaluation. We also compare the accuracy of our system to the accuracy of human translators.

Koehn, P., and Knight, K. 2002. Learning a Translation Lexicon from Monolingual Corpora. Proceedings of ACL-2002 Workshop on Unsupervised Lexical Acquisition. Philadelphia, PA.

Knight, K. and Marcu, D. 2002. Summarization Beyond Sentence Extraction: A Probabilistic Approach to Sentence Compression. Artificial Intelligence, 139(1).

Soricut, R., Knight, K., and Marcu, D. 2002. Using a Large Monolingual Corpus to Improve Translation Accuracy. Proceedings of 6th Conference of the Association for Machine Translation in the Americas (AMTA-2002). Tiburon, CA, October 8-12. PDF

Yamada, K. 2002. A Syntax-based Statistical Translation Model. Ph.D. thesis. University of Southern California. PostScript

Yamada, K. and Knight, K. 2002. A Decoder for Syntax-based Statistical MT. Proceedings of the Conference of the Association for Computational Linguistics (ACL-2002). Philadelphia. PostScript

Germann, U., Jahr, M., Knight, K., Marcu, D. and Yamada, K. 2001. Fast Decoding and Optimal Decoding for Machine Translation. Proceedings of the Conference of the Association for Computational Linguistics (ACL-2001), Toulouse, France, July 2001. Best Paper Award. PDF

Koehn, P. and Knight, K. 2001. Knowledge Sources for Word-Level Translation Models. Empirical Methods in Natural Language Processing conference (EMNLP 2001). PostScript

Yamada, K. and Knight, K. 2001. A Syntax-based Statistical Translation Model. Proceedings of the Conference of the Association for Computational Linguistics (ACL-2001). PostScript

Al-Onaizan, Y., Germann, U., Hermjakob, U., Knight, K., Koehn, P., Marcu, D. and Yamada, K. 2000. Translating with Scarce Resources. National Conference on Artificial Intelligence (AAAI), Austin, Texas, July 30-August 3, 2000. PostScript

Koehn, P. and Knight, K. 2000. Estimating Word Translation probabilities from Unrelated Monolingual Corpora Using the EM Algorithm. National Conference on Artificial Intelligence (AAAI). PostScript

Knight, K. and Marcu, D. 2000. Statistical-based Summarization --- Step One: Sentence Compression. National Conference on Artificial Intelligence (AAAI), Austin, Texas, July 30-August 3, 2000. Outstanding Paper Award. PostScript

Knight, K. 1999. Decoding Complexity in Word-Replacement Translation Models. Computational Linguistics, Squibs and Discussion, 25(4), 1999. PostScript

Knight, K. 1999. Mining Online Text. Communications of the ACM, 42(11), November, 1999.

Knight, K. 1999. A Statistical MT Tutorial Workbook. Unpublished, August 1999. PDF

Knight, K. and Yamada, K. 1999. A Computational Approach to Deciphering Unknown Scripts. Proceedings of the ACL Workshop on Unsupervised Learning in Natural Language Processing. PostScript