University of Southern California
Viterbi School of Engineering
Information Sciences Institute
4676 Admiralty Way, Suite 1001
Marina del Rey, CA 90292
Natural Language Processing
My PhD advisor is Daniel Marcu.
My focus has been to push the state of the art in machine translation by marrying scalable and effective learning and inference with accurate syntactic models of word alignment. I also have many related interests, such as parallel text discovery, morphological modeling and generation, reordering, and data visualization. Technical correspondence is always welcome.
I successfully defended my thesis, Syntactic Alignment Models for Large-Scale Statistical Machine Translation, on March 21, 2012.
Automatic Parallel Fragment Extraction from Noisy Data. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), 2012. [slides] [keynote] [code]
Feature-Rich Language-Independent Syntax-Based Alignment for Statistical Machine Translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2011. [slides] [keynote] [code]
Hierarchical Search for Word Alignment. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), 2010. [slides]
Minimally Supervised Morphological Segmentation with Applications to Machine Translation. In Proceedings of the 7th Conference of the Association for Machine Translation in the Americas (AMTA), 2006.
Building an English-Iraqi Arabic Machine Translation System for Spoken Utterances with Limited Resources. In Proceedings of Interspeech - ICSLP, 2006.
- Nile: A hierarchical syntax-based word alignment toolkit for statistical machine translation.
- Picaro: A simple command-line word-alignment visualization tool.
- Pyglog: A logging facility for Python.
- Alignment Visualization (web tool): A web-based version of Picaro.
- SBMT Rule Extraction (web tool): Visualize alignments and extract translation rules.
- Research Intern, Google Inc., Summer 2011.
- CSCI 562 - Empirical Methods in Natural Language Processing, Teaching Assistant, Fall 2010.
- NAACL HLT 2010, Los Angeles, Local Organizing Committee, with David Chiang, Ed Hovy, and Jonathan May.
- USC/ISI NL Seminar, Organizer, Fall 2009, Spring 2010, Summer 2010, Fall 2011, Spring 2012.
- USC/ISI NLP Reading Group, Organizer, Fall 2008, Spring 2009.
- 2nd Annual ISD Graduate Student Symposium, Co-Organizer, with Wes Kerr and Jonathan May.