Education
- Ph.D. in Computer and Information Science, University of Pennsylvania, 2004. Advisor: Aravind K. Joshi. Dissertation: Evaluation of Grammar Formalisms for Applications to Natural Language Processing and Biological Sequence Analysis. Published as: Grammars for Language and Genes: Theoretical and Empirical Investigations, Springer, 2012.
- S.M. in Computer Science, Harvard University, 1997.
- A.B. cum laude in Computer Science, Harvard University, 1997.
Honors and awards
- Best paper award, with W. Wang and K. Knight, NAACL HLT 2009
- Best paper award, ACL 2005
- Morris and Dorothy Rubinoff Award, 2005, University of Pennsylvania, for a dissertation that represents an advance in innovative application of computer technology
- Phi Beta Kappa, fall 1996, and Detur Book Prize, 1994, Harvard University, awarded to top 5% of class
Professional experience
- 2013–present: Project leader, USC Information Sciences Institute.
- July 2007–present: Research assistant professor, USC Department of Computer Science.
- 2006–2012: Computer scientist, USC Information Sciences Institute.
- Summer 2005: Senior researcher, Johns Hopkins CLSP Summer Workshop.
- Sep 2004–Dec 2005: Postdoctoral research associate, Univ. of Maryland Institute for Advanced Computer Studies.
Teaching experience
- PhD students: Ashish Vaswani, Hui Zhang, Theerawat Songyot
- Summer interns: Michael Bloodgood (Delaware), Wei Ho (Princeton), Paramveer Dhillon (Penn), John DeNero (Berkeley), Amittai Axelrod (Washington), Adam Pauls (Berkeley), Yoav Goldberg (Ben Gurion), Anne Irvine (JHU), Xuchen Yao (JHU), Ada Wan (CUNY), Jackie Lee (MIT), Arvind Neelakantan (Columbia/UMass)
- PhD thesis committees
- external: Adam Lopez (Univ. of Maryland), 2008; Hendra Setiawan (Natl. Univ. of Singapore), 2008; John DeNero (Berkeley), 2010; Kevin Gimpel (CMU), 2012; Baskaran Sankaran (Simon Fraser), 2013; Andrea Gesmundo (Geneva), 2013
- internal: Jonathan May, 2010; Sujith Ravi, 2011; Stephen Tratz, 2011; Steve DeNeefe, 2011; Dirk Hovy, 2013
- Fall 2007–2009, 2011–2012: Instructor, Empirical Methods in Natural Language Processing, Univ. of Southern California, with K. Knight and L. Huang
- Spring 2011: Instructor, Statistical Machine Translation, Univ. of Southern California, with K. Knight and L. Huang
- Spring 2007 and 2008: Instructor, Natural Language Processing, Univ. of Southern California, with E. Hovy et al.
- Fall 2005: Instructor, Computational Linguistics I, Univ. of Maryland, with Philip Resnik
Publications
Journal articles
- D. Chiang, 2012. Hope and fear for discriminative training of statistical translation models. J. Machine Learning Research 13:1159–1187.
- Y. Marton, D. Chiang, and P. Resnik. 2012. Soft syntactic constraints for Arabic-English hierarchical phrase-based translation. Machine Translation 26(1–2):137–157.
- D. Chiang, 2007. Hierarchical phrase-based translation. Computational Linguistics 33(2):201–228.
- K. A. Dill, A. Lucas, J. Hockenmaier, L. Huang, D. Chiang, and A. K. Joshi, 2007. Computational linguistics: a new tool for exploring biopolymer structures and statistical mechanics. Polymer 48:4289–4300.
- D. Chiang, A. K. Joshi, and D. B. Searls, 2006. Grammatical representations of macromolecular structure. J. Computational Biology 13(5):1077–1100.
- D. Chiang, A. K. Joshi, and K. A. Dill, 2006. A grammatical theory for the conformational changes of simple helix bundles. J. Computational Biology 13(1):21–42.
- M. Dras, D. Chiang, and W. Schuler, 2004. On relations of constituency and dependency grammars. Research on Language and Computation 2(2):281–305.
Books and book chapters
- D. Chiang. 2012. Grammars for Language and Genes: Theoretical and Empirical Investigations. Springer.
- D. Chiang. 2003. Statistical parsing with an automatically extracted tree adjoining grammar. In R. Bod et al., editors, Data Oriented Parsing. CSLI Publications, Stanford, pages 299–316.
Refereed conference papers
- D. Chiang, J. Andreas, D. Bauer, K.M. Hermann, B. Jones and K. Knight. 2013. Parsing graphs with hyperedge replacement grammars. Proc. ACL, pages 924–932.
- A. Vaswani, L. Huang, and D. Chiang. 2012. Smaller alignment models for better translations: unsupervised word alignment with the l0 norm. In Proc. ACL, pages 311–319.
- H. Zhang and D. Chiang. 2012. An exploration of forest-to-string translation: Does translation help or hurt parsing? In Proc. ACL (Vol. 2: Short Papers), pages 317–321.
- A. Vaswani, H. Mi, L. Huang, and D. Chiang. 2011. Rule Markov models for fast tree-to-string translation. In Proc. ACL, pages 856–864.
- D. Chiang, S. DeNeefe, and M. Pust. 2011. Two easy improvements to lexical weighting. In Proc. ACL (Vol. 2: Short Papers), pages 455–460.
- S. Cai, D. Chiang, and Y. Goldberg. 2011. Language-independent parsing with empty elements. In Proc. ACL (Vol. 2: Short Papers), pages 212–216.
- A. Vaswani, A. Pauls, and D. Chiang, 2010. Efficient optimization of an MDL-inspired objective function for unsupervised part-of-speech tagging. In Proc. ACL, pages 209–214.
- D. Chiang. 2010. Learning to translate with source and target syntax. In Proc. ACL, pages 1443–1452.
- D. Chiang, J. Graehl, K. Knight, A. Pauls, and S. Ravi. 2010. Bayesian inference for finite-state transducers. In Proc. NAACL HLT, pages 447–455.
- A. Pauls, D. Klein, D. Chiang, and K. Knight, 2010. Unsupervised syntactic alignment with inversion transduction grammars. In Proc. NAACL HLT, pages 118–126.
- J. DeNero, D. Chiang, and K. Knight, 2009. Fast consensus decoding over translation forests. In Proc. ACL.
- D. Chiang, W. Wang, and K. Knight, 2009. 11,001 new features for statistical machine translation. In Proc. NAACL HLT, pages 218–226. Best paper award.
- D. Chiang, Y. Marton, and P. Resnik, 2008. Online large-margin training of syntactic and structural translation features. In Proc. EMNLP, pages 224–233.
- D. Chiang, S. DeNeefe, Y. S. Chan, and H. T. Ng, 2008. Decomposability of translation metrics for improved evaluation and efficient algorithms. In Proc. EMNLP, pages 610–619.
- H. Zhang, D. Gildea, and D. Chiang. 2008. Extracting synchronous grammar rules from word-level alignments in linear time. In Proc. COLING.
- L. Huang and D. Chiang, 2007. Forest rescoring: faster decoding with integrated language models. Proc. ACL, pages 144–151.
- Y. S. Chan, H. T. Ng, and D. Chiang, 2007. Word sense disambiguation improves statistical machine translation. Proc. ACL, pages 33–40.
- D. Chiang, M. Diab, N. Habash, O. Rambow, and S. Shareef, 2006. Parsing Arabic dialects. In Proc. EACL, Trento, pages 369–376.
- D. Chiang, A. Lopez, N. Madnani, C. Monz, P. Resnik, and M. Subotin, 2005. The Hiero machine translation system: extensions, evaluation, and analysis. In Proc. HLT/EMNLP, Vancouver, pages 779–786.
- D. Chiang, 2005. A hierarchical phrase-based model for statistical machine translation. In Proc. ACL, Ann Arbor, MI, pages 263–270. Best paper award.
- D. Chiang, 2003. Mildly context sensitive grammars for estimating maximum entropy models. In Proc. Formal Grammar, Vienna, August. CSLI Publications.
- D. Chiang and D. M. Bikel, 2002. Recovering latent information in treebanks. In Proc. COLING, Taipei, August, pages 183–189.
- D. Chiang and A. K. Joshi, 2002. Formal grammars for estimating partition functions of double-stranded chain molecules. Proc. HLT, San Diego, March, pages 63–67.
- D. Chiang, 2001. Constraints on strong generative power. In Proc. ACL, Toulouse, July, pages 124–131.
- D. Chiang, 2000. Statistical parsing with an automatically-extracted tree adjoining grammar. In Proc. ACL, pages 456–463.
- W. Schuler, D. Chiang, and M. Dras, 2000. Multi-component TAG and notions of formal power. In Proc. ACL, pages 448–455.
Refereed workshop papers
- D. Chiang, 2006. The weak generative capacity of linear tree adjoining grammars. In Proceedings of the Eighth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+8), Sydney, July.
- D. Chiang and O. Rambow, 2006. The hidden TAG model: synchronous grammars for parsing resource-poor languages. In Proceedings of the Eighth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+8), Sydney, July, pages 1–8.
- L. Huang and D. Chiang, 2005. Better k-best parsing. In Proceedings of the Ninth International Workshop on Parsing Technology, Vancouver, October, pages 53–64.
- D. Chiang, 2004. Uses and abuses of intersected languages. In Proceedings of the Seventh International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+7), Vancouver, May.
- D. Chiang, 2002. Putting some weakly context-free formalisms in order. In Proceedings of the Sixth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+6), Venice, May, pages 11–18.
- D. M. Bikel and D. Chiang, 2000. Two statistical parsing models applied to the Chinese Treebank. In Proceedings of the Second Chinese Language Processing Workshop, Hong Kong, October, pages 1–6.
- M. Dras, D. Chiang, and W. Schuler, 2000. A multi-level TAG approach to dependency. In Proceedings of the ESSLLI-2000 Workshop on Linguistic Theory and Grammar Implementation, Birmingham, UK, August, pages 33–46.
- D. Chiang, W. Schuler, and M. Dras, 2000. Some remarks on an extension of synchronous TAG. In Proceedings of the Fifth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5), Paris, May, pages 61–66.
Other papers
- D. Chiang, K. A. Dill, A. K. Joshi, 2003. Estimating partition functions of helix bundles using context-free grammars. Greater Philadelphia Bioinformatics Alliance Retreat, poster session, 24 October, pages 37–38.
- F.-D. Chiou, D. Chiang, and M. Palmer, 2001. Facilitating treebank annotation using a statistical parser. In Proceedings of the Human Language Technology Conference (HLT 2001), poster session, San Diego, March, pages 117–120.
Invited presentations
- Synchronous Grammars, tutorial given at TAG+11, 26 Sep 2012.
- Hope and fear for discriminative training of statistical translation models. Columbia University, 17 May 2011.
- Learning to translate with source and target syntax. IBM TJ Watson Research Center, 14 Jun 2010; Carnegie Mellon University, Language Technologies Institute, 16 Jun 2010.
- Towards tree-to-tree translation. Cambridge University, 1 Feb 2010; University of Edinburgh, School of Informatics, 5 Feb 2010.
- What can computational linguistics do for computational stemmatology? Studia Stemmatologica, University of Helsinki, 28 Jan 2010.
- Online large-margin training of syntactic and structural translation features. Johns Hopkins University, Center for Language and Speech Processing, 16 September 2008.
- Microsoft Research Asia, Beijing, 19 and 26 September 2007.
- Harvard University Computer Science Colloquium, 9 November 2006.
- Synchronous Grammars and Tree Transducers, tutorial given at ACL-COLING 2006 with K. Knight, 16 July 2006.
- National University of Singapore, School of Computing, 11 Apr 2006.
- NIPS 2005 Workshop on Advances in Structured Learning for Text and Speech Processing, 9 December 2005.
- NYU Department of Computer Science, 16 September 2005.
- USC Information Sciences Institute, 6 July 2005.
- Google, Inc., 13 June 2005.
- From phrase-based towards syntax-based machine translation. JHU CLSP, 8 Feb 2005.
- Sizing up formal grammars for statistical parsing and translation. Language Weaver, Inc., 3 June 2004.
- Putting formal grammars to work. Univ. of MD Inst. for Advanced Computer Studies, Computational Linguistics Colloquium, 26 April 2004.
- Formal grammars for biological sequence analysis. NYU Department of Computer Science, 12 April 2002.
- Extracting tree adjoining grammars for statistical parsing of English and Chinese. Univ. of MD Inst. for Advanced Computer Studies, Computational Linguistics Colloquium, 15 August 2001.
Grants awarded
- Training Machine Translation Models as Deep Architectures. 2013–2014. Google Faculty Research Award, $84,000.
- Automatic Knowledge Acquisition for Language Translation. 2012–2014. Funded by DARPA at $500,000.
- Machine Translation for Language Preservation. With S. Bird. 2011–2012. Funded by NSF at $180,000.
- TAU: Learning Natural Language Structure using Multilingual Texts. 2010–2012. Funded by DARPA at $400,000.
- Phylo: Phylogenetic Reconstruction of Textual Histories. 2010–2011. Funded by NSF at $75,000.
- Structured Learning for Structured Machine Translation. 2009–2010. Funded by DARPA at $100,000.
Professional activities
- NAACL executive board, 2011–2012
- Secretary, ACL Special Interest Group on Machine Translation
- Local organizer for NAACL HLT 2010, with E. Hovy, J. May, and J. Riesa
- Editorial board, Computational Linguistics, 2006–2008; Artificial Intelligence Research, 2009–2012; Machine Translation Journal, 2012–present
- Action editor, Transactions of the ACL, 2012–present
- Guest editor, ACM Trans. Asian Language Information Processing Special Issue on Machine Translation
- Area chair: ACL 2006, 2008 (machine translation and multilinguality); EMNLP 2007, 2010, IJCNLP 2011 (machine translation)
- Publications chair, EMNLP 2009
- Organizer, NAACL Workshop on Syntax and Structure in Statistical Translation, with D. Wu, 2007–2009
- NSF review panelist, 2002, 2010, 2013
- Reviewer: Computational Linguistics, J. Artificial Intelligence Research, J. Natural Language Engineering
- Reviewer: ACL, NAACL, EACL, IJCNLP, EMNLP, NIPS, ICML, IJCAI.
- Local site coordinator for North American Computational Linguistics Olympiad, 2009–present
Other information
- Citizenship: U.S.A.
- Languages: English (native), Mandarin Chinese (basic), ancient Latin and Greek (reading)