
- Office: Information Sciences Institute 940
- Mailing address: USC Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
- Phone: (O) +1-310-448-9341, (F) +1-310-822-0751
- E-mail: my last name @isi.edu
Computer Scientist (ISI)
Research Assistant Professor (Computer Science)
My research is in natural language processing, the subfield of computer science that aims to enable computers to understand and produce human language. I focus mainly on language translation, and am interested in syntactic parsing and other areas as well.
Curriculum vitae
Current and recent activities
Students
Selected publications
Machine translation
- Hope and fear for discriminative training of statistical translation models. To appear in J. Machine Learning Research.
- Rule Markov models for fast tree-to-string translation. Ashish
Vaswani, Haitao Mi, Liang Huang and David Chiang, 2011. In
Proc. ACL, pages 856–864. [pdf] [bib]
- Two easy improvements to lexical weighting. With Steve DeNeefe and
Michael Pust, 2011. In Proc. ACL, poster session, pages 455–460. [pdf] [bib]
- Learning to translate with source and target
syntax. 2010. In Proc. ACL, pages 1443–1452. [pdf]
Slides: [pdf, 2.5M]
- Unsupervised syntactic alignment with inversion transduction
grammars. Adam Pauls, Dan Klein, David Chiang, and Kevin Knight,
2010. In Proc. NAACL HLT, pages 118–126. [pdf]
- Fast consensus decoding over translation forests. John DeNero,
David Chiang, and Kevin Knight, 2009. In Proc. ACL. [pdf]
- 11,001 new features for statistical machine translation. With Wei
Wang and Kevin Knight, 2009. In Proc. NAACL HLT, pages
218–226. Best paper award. [pdf]
Slides: [pdf, 2.9M]
- Online large-margin training of syntactic and structural translation features. With Yuval Marton and Philip Resnik, 2008. In Proc. EMNLP, pages 224–233. [pdf]
- Decomposability of translation metrics for improved evaluation and efficient algorithms. With Steve DeNeefe, Yee Seng Chan, and Hwee Tou Ng, 2008. In Proc. EMNLP, pages 610–619. [pdf]
- Hierarchical phrase-based translation. 2007.
Computational Linguistics 33(2):201–228.
[pdf] Expands on the ACL 2005 paper
below with a description, including experimental results, of the
decoding algorithm (“cube pruning”).
- Forest rescoring: faster decoding with integrated language
models. Liang Huang and David Chiang, 2007. In Proc. ACL. [pdf]
- Word sense disambiguation improves statistical machine
translation. Yee Seng Chan, Hwee Tou Ng, and David Chiang, 2007. In
Proc. ACL. [pdf]
- A hierarchical phrase-based model for statistical machine
translation. 2005. In Proc. ACL, pages 263–270. Best paper award. [pdf]
Other natural language processing
- Language-independent parsing with empty elements. Shu Cai, David
Chiang, and Yoav Goldberg, 2011. In Proc. ACL, pages 212–216. [pdf] [bib]
- Efficient optimization of an MDL-inspired objective function for
unsupervised part-of-speech tagging. Ashish Vaswani, Adam Pauls, and David Chiang,
2010. In Proc. ACL, pages 209–214. [pdf]
- Bayesian inference for finite-state transducers. With Jonathan
Graehl, Kevin Knight, Adam Pauls, and Sujith Ravi,
2010. In Proc. NAACL HLT, pages 447–455. [pdf]
- The hidden TAG model: synchronous grammars for parsing
resource-poor languages. With Owen Rambow, 2006. In Proc. TAG+8, pages 1–8. [pdf]
- Parsing Arabic dialects. With Mona Diab, Nizar Habash, Owen Rambow,
and Safiullah Shareef, 2006. In Proc. EACL. [pdf]
- Better k-best parsing. Liang Huang and David Chiang, 2005. In
Proc. IWPT. [pdf]
- Recovering latent information in treebanks. With Daniel M. Bikel, 2002. In Proc. COLING, pages 183–189. [pdf]
- Statistical parsing with an automatically extracted tree adjoining grammar. 2003. In Data Oriented Parsing, CSLI Publications, pages 299–316. Combines and updates results from two papers, one from ACL 2000 and one from the co-located Chinese workshop. [pdf]
Biological sequence analysis
- Computational linguistics: a new tool for exploring biopolymer
structures and statistical mechanics. Ken A. Dill, Adam Lucas,
Julia Hockenmaier, Liang Huang, David Chiang, and Aravind K.
Joshi, 2007. Polymer 48:4289–4300.
- Grammatical representations of macromolecular structure. With
Aravind K. Joshi and David B. Searls, 2006. J. Computational
Biology 13(5):1077–1100.
- A grammatical theory for the conformational changes of simple
helix bundles. With Aravind K. Joshi and Ken A. Dill,
2006. J. Computational Biology 13(1):21–42.
Formal grammars and their applications
- Flexible composition and delayed tree-locality. With Tatjana
Scheffler, 2008. In Proc. TAG+9, pages 17–24. [pdf]
- The weak generative capacity of linear tree adjoining
grammars. 2006. In Proc. TAG+8, pages 25–32. [pdf]
- An introduction to synchronous grammars: part of a tutorial given at ACL 2006 with Kevin Knight. Notes: [pdf] Slides: [pdf]
- Evaluation of Grammar Formalisms for Applications to Natural
Language Processing and Biological Sequence Analysis, PhD
dissertation, University of Pennsylvania, 2004. Rubinoff Award. [pdf, 600k] Slides from defense: [pdf] Published as: Grammars for Language and Genes: Theoretical and Empirical Investigations, Springer, 2012.
- Uses and abuses of intersected languages. 2004. In Proc. TAG+7, pages 9–15. [pdf]
- Putting some weakly context-free formalisms in order. 2002. In Proc. TAG+6, pages 11–18. [pdf]
- Some remarks on an extension of synchronous TAG. With William Schuler and Mark Dras, 2000. In Proc. TAG+5, pages 61–66. [pdf]
Software and other stuff