武凯文
USC/Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA 90292, (310) 448-8716, e-mail:
Senior Research Scientist and Fellow, Information Sciences Institute (ISI).
Research Associate Professor in the Department of Computer Science of the University of Southern California.
Research interests include: artificial intelligence, natural language processing, machine translation, machine learning, automata theory, decipherment.
My publications. My vita.
If you don't want to read this whole page, here's a quick summary.
I'm involved in natural language research at USC/ISI, where we have a range of projects.
I think our approach to syntax in machine translation is best described in D. Barthelme's short story They called for more structure....
The outside of my laptop.
You can download Tiburon, a tree toolkit developed by Jonathan May. Check it out. You can also download Jonathan Graehl's very fine finite-state string toolkit Carmel.
Richard Sproat and I gave a NAACL-09 tutorial on writing systems, transliteration, and decipherment. Check out these slides from a separate talk on the Voynich Manuscript.
I co-teach USC's CS562 (Empirical Methods in Natural Language Processing).
I was asked to run for a position in my professional association. A bogus platform somehow circulated, which I completely repudiate. Having now been elected, though, I feel strangely compelled to implement it.
I recently said this out loud: "I want a four-door convertible like the one Lincoln was assassinated in."
I have been fortunate to work with great students. PhD graduates include Yaser Al-Onaizan (whose thesis was on named-entity translation, and who is now a researcher at IBM T. J. Watson), Kenji Yamada (whose thesis was on syntax-based statistical machine translation, now at MySpace), Irene Langkilde-Geary (whose thesis was on language generation, now at the University of Brighton), and Philipp Koehn (whose thesis was on noun phrase translation, and who is now on the faculty at the University of Edinburgh). I also supervised Ishwar Chander's PhD thesis on statistical article generation. After that, I co-advised Shou-de Lin's PhD (now on the faculty at National Taiwan University) and Liang Huang's PhD (now at Google), and Bryant Huang did an excellent Master's thesis on syntax-structuring for translation.
I recently went to Denmark. Here's me in Denmark.
Here's a story called Little House on the West Bank. No matter how readers interpret the text, my favorite part is still where the baby throws up on the mom.
Here's a gruesome fairy tale for you.
Here is a slide show that explains the global situation to American people.
I'm possibly the least-skilled person at drawing ever to get paid for it, for example: actual conversations ... strange things ... parking structures ... werld maps.
Fortune-telling cards.
A machine translation glossary for those in the field.
My idol is Fred Jelinek.
Here's a big and free French/English parallel corpus Ulrich Germann built.
Don't click here.
I wrote a book for little kids, and another book for big kids.
I used to be an Amber freak, a long time ago, and kept myself amused. Since then, three of my favorite writers are Philip K. Dick, Jack Womack, and Jim Thompson. I've read Thompson's Pop 1280 quite a few times -- that is a god-danged good book. Other favorites: A Scanner Darkly, Man in the High Castle, Confessions of a Crap Artist, Puttering about in a Small Land, Now Wait for Last Year, The Penultimate Truth, Ambient, Terraplane, Heathern, Elvissey, Random Acts of Senseless Violence, and Let's Put the Future Behind Us. I also like Don DeLillo, David Lodge, Kurt Vonnegut, and J. D. Salinger. Seems like the first couple of books are always better than all the rest. I like Bob Dylan's music and found his autobiography (Chronicles, Vol. 1) strangely similar to Neil Armstrong's biography (First Man). Other books: Tao Te Ching, and Benjamin Tucker's Instead of a Book.
2009
"Synchronous Tree Adjoining Machine Translation", (S. DeNeefe and K. Knight), Proc. EMNLP, 2009. Get paper in PDF.
"Minimized Models for Unsupervised Part-of-Speech Tagging", (S. Ravi and K. Knight), Proc. ACL, 2009. Get paper in PDF.
"Fast Consensus Decoding over Translation Forests", (J. DeNero, D. Chiang, and K. Knight), Proc. ACL, 2009. Get paper in PDF.
"Learning Phoneme Mappings for Transliteration without Parallel Data", (S. Ravi and K. Knight), Proc. NAACL, 2009. Get paper in PDF.
"11,001 New Features for Statistical Machine Translation", (D. Chiang, K. Knight, and W. Wang), Proc. NAACL, 2009. Best Paper Award. Get paper in PDF.
"Combining Constituent Parsers", (V. Fossum and K. Knight), Proc. NAACL (short paper), 2009. Get paper in PDF.
"Faster MT Decoding through Pervasive Laziness", (M. Pust and K. Knight), Proc. NAACL (short paper), 2009. Get paper in PDF.
"A New Objective Function for Word Alignment", (T. Bodrumlu, K. Knight, and S. Ravi), Proc. NAACL Workshop on Integer Linear Programming for NLP, 2009. Get paper in PDF.
"The Power of Extended Top-Down Tree Transducers", (A. Maletti, J. Graehl, M. Hopkins, and K. Knight), SIAM J. Comput. 39(2), pp. 410-430, 2009. Get paper in PDF.
"Probabilistic Methods for a Japanese Syllable Cipher", (S. Ravi and K. Knight), Proc. International Conference on the Computer Processing of Oriental Languages, 2009. Get paper in PDF.
"Applications of Weighted Automata in Natural Language Processing", (K. Knight and J. May), Handbook of Weighted Automata (M. Droste, W. Kuich, H. Vogler, eds.), 2009.
"Attacking Decipherment Problems Optimally with Low-Order N-gram Models", (S. Ravi and K. Knight), Cryptologia (to appear), 2009.
2008
"Training Tree Transducers", (J. Graehl, K. Knight, and J. May), Computational Linguistics, 34(3), 2008. Get paper in PDF from MIT Press.
"Capturing Practical Natural Language Transformations", (K. Knight), Machine Translation, 21(2), 2007. Get draft in PDF. The complete publication is available at www.springerlink.com.
"Using Syntax to Improve Word Alignment Precision for Syntax-Based Machine Translation", (V. Fossum, K. Knight, and S. Abney), Proc. Workshop on Statistical MT, ACL, 2008. Get paper in PDF.
"Name Translation in Statistical Machine Translation: Learning When to Transliterate", (U. Hermjakob, K. Knight, and H. Daume III), Proc. ACL, 2008. Get paper in PDF.
"Attacking Decipherment Problems Optimally with Low-Order N-gram Models", (S. Ravi and K. Knight), Proc. EMNLP, 2008. Get paper in PDF.
"Automatic Prediction of Parser Accuracy", (S. Ravi, K. Knight, and R. Soricut), Proc. EMNLP, 2008. Get paper in PDF.
"Overcoming Vocabulary Sparsity in MT Using Lattices", (S. DeNeefe, U. Hermjakob, and K. Knight), Proc. AMTA, 2008. Get paper in PDF.
"Using Bilingual Chinese-English Word Alignments to Resolve PP-Attachment Ambiguity in English", (V. Fossum and K. Knight), Proc. AMTA (student session), 2008. Get paper in PDF.
2007
"Syntactic Re-Alignment Models for Machine Translation", (J. May and K. Knight), Proc. EMNLP-CoNLL, 2007. Get paper in PDF."What Can Syntax-Based MT Learn from Phrase-Based MT?", (S. DeNeefe, K. Knight, W. Wang, D. Marcu), Proc. EMNLP-CoNLL, 2007. Get paper in PDF.
"Binarizing Syntax Trees to Improve Syntax-Based Machine Translation Accuracy", (W. Wang, K. Knight and D. Marcu), Proc. EMNLP-CoNLL, 2007. Get paper in PDF.
2006
"Statistical Syntax-Directed Translation with Extended Domain of Locality", (L. Huang, K. Knight, A. Joshi), Proc. AMTA (poster), 2006. Get paper in PDF.
"Tiburon: A Weighted Tree Automata Toolkit", (J. May and K. Knight), Proc. International Conference on Implementation and Application of Automata (CIAA), Lecture Notes in Computer Science (Springer), v. 4094/2006, 2006. Get paper in PDF.
"Building an English-Iraqi Arabic Machine Translation System for Spoken Utterances with Limited Resources", (J. Riesa, B. Mohit, K. Knight, D. Marcu), Proc. Interspeech, 2006. Get paper in PDF.
"Unsupervised Analysis for Decipherment Problems", (K. Knight, A. Nair, N. Rathod, and K. Yamada), Proc. ACL-COLING (poster), 2006. Get paper in PDF.
"Scalable Inference and Training of Context-Rich Syntactic Models", (M. Galley, J. Graehl, K. Knight, D. Marcu, S. DeNeefe, W. Wang, and I. Thayer), Proc. ACL-COLING, 2006. Get paper in PDF.
"Discovering the Linear Writing Order of a Two-Dimensional Ancient Hieroglyphic Script", (Shou-de Lin and Kevin Knight), Artificial Intelligence, volume 170(4-5), 2006. Get draft in PDF.
"A Better N-Best List: Practical Determinization of Weighted Finite Tree Automata", (J. May and K. Knight), Proc. NAACL-HLT, 2006. Get paper in PDF.
"Capitalizing Machine Translation", (W. Wang, K. Knight, D. Marcu), Proc. NAACL-HLT, 2006. Get paper in PDF.
"Synchronous Binarization for Machine Translation", (H. Zhang, L. Huang, D. Gildea, K. Knight), Proc. NAACL-HLT, 2006. Get paper in PDF.
"Relabeling Syntax Trees to Improve Syntax-Based Machine Translation Quality", (B. Huang and K. Knight), Proc. NAACL-HLT, 2006. Get paper in PDF.
2005
"Transonics: A Practical Speech-to-Speech Translator for English-Farsi Medical Dialogs", (Robert Belvin, Emil Ettelaie, Sudeep Gandhe, Panayiotis Georgiou, Kevin Knight, Daniel Marcu, Scott Millward, Shrikanth Narayanan, Howard Neely, David Traum), Proc. ACL poster/demo, 2005.
"Interactively Exploring a Machine Translation Model", (Steve DeNeefe, Kevin Knight, and Hayward H. Chan), Proc. ACL poster/demo, 2005. Get paper in PDF.
"Machine Translation in the Year 2004", (K. Knight, D. Marcu), Proc. ICASSP, 2005. Get paper in PDF.
"An Overview of Probabilistic Tree Transducers for Natural Language Processing", (K. Knight, J. Graehl), Proc. of the Sixth International Conference on Intelligent Text Processing and Computational Linguistics (CICLing), Lecture Notes in Computer Science, copyright Springer Verlag, 2005. Get paper in PostScript. Get paper in PDF.
2004
"Text Simplification for Information Seeking Applications", (B. Beigman Klebanov, K. Knight, D. Marcu), In: On the Move to Meaningful Internet Systems, eds. R. Meersman and Z. Tari, Lecture Notes in Computer Science (3290), copyright Springer Verlag, 2004.
"What's in a Translation Rule?", (M. Galley, M. Hopkins, K. Knight, D. Marcu), Proc. NAACL-HLT, 2004. Get paper in PDF.
"Training Tree Transducers", (J. Graehl, K. Knight), Proc. NAACL-HLT, 2004. Get paper in PDF.
"The Transonics Spoken Dialogue Translator: An Aid for English-Persian Doctor-Patient Interviews", (S. Narayanan, S. Ananthakrishnan, R. Belvin, E. Ettaile, S. Gandhe, S. Ganjavi, P. G. Georgiou, C. M. Hein, S. Kadambe, K. Knight, D. Marcu, H. E. Neely, N. Srinivasamurthy, D. Traum, and D. Wang), AAAI Fall Symposium, 2004.
2003
"Syntax-based Language Models for Machine Translation", (E. Charniak, K. Knight, and K. Yamada), Proc. MT Summit IX, 2003. Get paper in PDF.
"Feature-Rich Statistical Translation of Noun Phrases", (P. Koehn and K. Knight), Proc. ACL, 2003. Get paper in PDF.
"Syntax-based Alignment of Multiple Translations: Extracting Paraphrases and Generating New Sentences", (B. Pang, K. Knight, and D. Marcu), Proc. NAACL-HLT, 2003. Get paper in PDF.
"Transonics: A Speech to Speech System for English-Persian Interactions", (S. Narayanan, S. Ananthakrishnan, R. Belvin, E. Ettaile, S. Ganjavi, P. Georgiou, C. Hein, S. Kadambe, K. Knight, D. Marcu, H. Neely, N. Srinivasamurthy, D. Traum, and D. Wang), Proc. IEEE ASRU, 2003.
"Finding the WRITE Stuff: Automatic Identification of Discourse Structure in Student Essays", (J. Burstein, D. Marcu, and K. Knight), IEEE Intelligent Systems, Jan/Feb, 2003.
"Translation with Scarce Bilingual Resources", (Y. Al-Onaizan, U. Germann, U. Hermjakob, K. Knight, P. Koehn, D. Marcu, K. Yamada), Machine Translation, 2003.
"Fast Decoding and Optimal Decoding for Machine Translation", (U. Germann, M. Jahr, K. Knight, D. Marcu, and K. Yamada), Artificial Intelligence, 2003.
"Teaching Statistical Machine Translation", (K. Knight), Proc. MT Summit IX Workshop on Teaching Machine Translation, 2003. Get paper in PDF.
"Cognates Can Improve Statistical Translation Models", (G. Kondrak, D. Marcu, and K. Knight), Proc. NAACL-HLT, 2003. Get paper in PDF.
"Desperately Seeking Cebuano", (Douglas W. Oard, David Doermann, Bonnie Dorr, Daqing He, Philip Resnik, Amy Weinberg, William Byrne, Sanjeev Khudanpur, David Yarowsky, Anton Leuski, Philipp Koehn, and Kevin Knight), Proc. NAACL-HLT, 2003.
2002
"Using a Large Monolingual Corpus to Improve Translation Accuracy", (R. Soricut, K. Knight, and D. Marcu), Proceedings of the 6th AMTA Conference, 2002.
"Learning a Translation Lexicon from Monolingual Corpora", (P. Koehn and K. Knight), Proc. of ACL Workshop on Unsupervised Lexical Acquisition, 2002.
"Summarization Beyond Sentence Extraction: A Probabilistic Approach to Sentence Compression", (K. Knight and D. Marcu), Artificial Intelligence, 139(1), 2002.
"Named Entity Translation: Extended Abstract", (Y. Al-Onaizan and K. Knight), Proc. HLT, 2002. Get paper in Postscript.
"Translating Named Entities Using Monolingual and Bilingual Resources", (Y. Al-Onaizan and K. Knight), Proc. of the Conference of the Association for Computational Linguistics (ACL), 2002. Get paper in Postscript.
"Machine Transliteration of Names in Arabic Text", (Y. Al-Onaizan and K. Knight), Proc. of ACL Workshop on Computational Approaches to Semitic Languages, 2002. Get paper in Postscript.
"A Decoder for Syntax-Based Statistical MT", (K. Yamada and K. Knight), Proc. of the Conference of the Association for Computational Linguistics (ACL), 2002. Get paper in Postscript.
2001
"A Syntax-Based Statistical Translation Model", (K. Yamada and K. Knight), Proc. of the Conference of the Association for Computational Linguistics (ACL), 2001. Get paper in Postscript.
"Fast Decoding and Optimal Decoding for Machine Translation", (U. Germann, M. Jahr, K. Knight, D. Marcu, and K. Yamada), Proc. of the Conference of the Association for Computational Linguistics (ACL), 2001. Best Paper Award. Get paper in PDF.
"Knowledge Sources for Word-Level Translation Models", (P. Koehn and K. Knight), Empirical Methods in Natural Language Processing conference (EMNLP), 2001. Get paper in Postscript.
2000
"Translating with Scarce Resources", (Y. Al-Onaizan, U. Germann, U. Hermjakob, K. Knight, P. Koehn, D. Marcu, K. Yamada), National Conference on Artificial Intelligence (AAAI), 2000. Get paper in Postscript.
"Preserving Ambiguities in Generation via Automata Intersection", (K. Knight and I. Langkilde), National Conference on Artificial Intelligence (AAAI), 2000. Get paper in Postscript.
"Estimating Word Translation Probabilities from Unrelated Monolingual Corpora Using the EM Algorithm", (P. Koehn and K. Knight), National Conference on Artificial Intelligence (AAAI), 2000. Get paper in Postscript.
"Statistics-Based Summarization --- Step One: Sentence Compression", (K. Knight and D. Marcu), National Conference on Artificial Intelligence (AAAI), 2000. Outstanding Paper Award. Get paper in Postscript.1999
"Decoding Complexity in Word-Replacement Translation Models", Computational Linguistics, Squibs & Discussion, 25(4), 1999. Get paper in Postscript (long version).
"Mining Online Text", Communications of the ACM, 42(11), November 1999.
"A Statistical MT Tutorial Workbook," unpublished, August 1999. Get paper in PDF. Get paper in Word.
"A Computational Approach to Deciphering Unknown Scripts", (K. Knight and K. Yamada), Proceedings of the ACL Workshop on Unsupervised Learning in Natural Language Processing, 1999. Get paper in PostScript.
1998
"Machine Transliteration", (K. Knight and J. Graehl), Computational Linguistics, 24(4), 1998. Get conference version in PDF.
"Translation with Finite-State Devices," (K. Knight and Y. Al-Onaizan), Proceedings of the 4th AMTA Conference, 1998. Get paper in PostScript.
"Generation that Exploits Corpus-based Statistical Knowledge", (I. Langkilde and K. Knight), Proc. of the Conference of the Association for Computational Linguistics (COLING/ACL), 1998. Get paper in PostScript.
"Translating Names and Technical Terms in Arabic Text", (B. Stalls and K. Knight), Proc of the COLING/ACL Workshop on Computational Approaches to Semitic Languages, 1998. Get paper in PostScript.
"The Practical Value of N-Grams in Generation", (I. Langkilde and K. Knight), Proc. of the International Natural Language Generation Workshop, 1998. Get paper in PostScript.
1997
"Automating Knowledge Acquisition for Machine Translation", AI Magazine 18(4), 1997. Get paper in PostScript.
"Machine Transliteration", (K. Knight and J. Graehl), Proc. of the Conference of the Association for Computational Linguistics (ACL), 1997. Get paper in PDF.
1996
"It's Time for Your Evaluation", special section, What Makes a Compelling Empirical Evaluation in AI, IEEE Expert, 11(5), 1996.
"Learning Word Meanings by Instruction", Proc. of the National Conference on Artificial Intelligence (AAAI), 1996. Get paper in PostScript.
1995
"Two-Level, Many-Paths Generation", (K. Knight and V. Hatzivassiloglou), Proc. of the Conference of the Association for Computational Linguistics (ACL), 1995. Get paper in PostScript.
"Filling Knowledge Gaps in a Broad-Coverage MT System", (K. Knight, I. Chander, M. Haines, V. Hatzivassiloglou, E. Hovy, M. Iida, S. K. Luk, R. Whitney, and K. Yamada), Proc. of the International Joint Conference on Artificial Intelligence (IJCAI), 1995. Get paper in PostScript.
"Unification-Based Glossing", (V. Hatzivassiloglou and K. Knight), Proc. of the International Joint Conference on Artificial Intelligence (IJCAI), 1995. Get paper in PostScript.
1994/earlier
"Integrating Knowledge Bases and Statistics in MT", (K. Knight, I. Chander, M. Haines, V. Hatzivassiloglou, E. Hovy, M. Iida, S. Luk, A. Okumura, R. Whitney, K. Yamada), Proc. of the Conference of the Association for Machine Translation in the Americas (AMTA), 1994. Get paper in PostScript.
"Building a Large-Scale Knowledge Base for Machine Translation", (K. Knight and S. Luk), Proc. of the National Conference on Artificial Intelligence (AAAI), 1994. Get paper in PostScript.
"Automated Postediting of Documents", (K. Knight and I. Chander), Proc. of the National Conference on Artificial Intelligence (AAAI), 1994. Get paper in PostScript.
"Are Many Reactive Agents Better Than a Few Deliberative Ones?" Proc. of the International Joint Conference on Artificial Intelligence (IJCAI), 1993.
"Unification", The Encyclopedia of Artificial Intelligence, Second Edition, John Wiley and Sons, 1992.
"Integrating Knowledge Acquisition and Language Acquisition", Applied Intelligence, 1(1), 1992.
Artificial Intelligence, Second Edition (E. Rich and K. Knight), McGraw-Hill Book Company, 1991.
"Connectionist Ideas and Algorithms", Communications of the ACM, 33(11), November, 1990.
"Knowledge and Natural Language Processing", (J. Barnett, K. Knight, I. Mani, and E. Rich), Communications of the ACM, 33(8), August, 1990.
"Unification: A Multidisciplinary Survey,", ACM Computing Surveys, 21(1), 1989. Get paper in PDF.