
- Office: Information Sciences Institute 911
- Mailing address: USC Information Sciences Institute, 4676 Admiralty Way, Suite 1001, Marina del Rey, CA 90292
- Phone: (O) +1-310-448-9341, (F) +1-310-822-0751
- E-mail: my last name @isi.edu
Computer Scientist (ISI)
Research Assistant Professor (USC Computer Science)
I am interested in statistical machine translation, particularly
applications of synchronous grammars to statistical MT. During my PhD
work at the University of Pennsylvania under Aravind Joshi, I worked
on tree-adjoining grammars and related formalisms; stochastic TAG
parsing models; synchronous grammars; and formal grammars applied to
RNA and protein structure prediction.
Curriculum vitae
Please don't spell my name as "Chang", but please do pronounce it that way!
Current and recent activities:
Selected publications
Synchronous grammars and their applications
- Hierarchical phrase-based translation. 2007.
Computational Linguistics 33(2):201–228. [PDF] Expands on the ACL 2005 paper
below with a description, including experimental results, of the
decoding algorithm ("cube pruning").
- Forest rescoring: faster decoding with integrated language
models. Liang Huang and David Chiang, 2007. Proc. ACL. [PDF]
- Word sense disambiguation improves statistical machine
translation. Yee Seng Chan, Hwee Tou Ng, and David Chiang, 2007.
Proc. ACL. [PDF]
- An introduction to synchronous grammars: part of a tutorial given at ACL 2006 with Kevin Knight. Notes: [PDF] Slides: [PDF]
- The hidden TAG model: synchronous grammars for parsing
resource-poor languages. 2006. In Proc. TAG+8, pages 1–8. With Owen Rambow. [PDF]
- A hierarchical phrase-based model for statistical machine translation. 2005. In Proc. ACL, pages 263–270. Best paper award. [PDF]
- Some remarks on an extension of synchronous TAG. With William Schuler and Mark Dras. 2000. In Proc. TAG+5, pages 61–66. [PS] [PDF]
Statistical parsing
- Parsing Arabic dialects. 2006. In Proc. EACL. With
Mona Diab, Nizar Habash, Owen Rambow, and Safiullah Shareef. [PDF]
- Better k-best parsing. 2005. In Proc. IWPT. Liang Huang and David Chiang. [PDF]
- Recovering latent information in treebanks. 2002. With Daniel M. Bikel. In Proc. COLING 2002, pages 183–189. [PS] [PDF]
- Statistical parsing with an automatically extracted tree adjoining grammar. 2003. In Data Oriented Parsing, CSLI Publications, pages 299–316. Combines and updates results from two papers, from ACL 2000 and the co-located Chinese workshop. [PS] [PDF]
Biological sequence analysis
- Computational linguistics: a new tool for exploring biopolymer
structures and statistical mechanics. 2007. Ken A. Dill, Adam Lucas,
Julia Hockenmaier, Liang Huang, David Chiang, and Aravind K.
Joshi. Polymer 48:4289–4300.
- Grammatical representations of macromolecular
structure. 2006. J. Computational Biology 13(5):1077–1100. With Aravind
K. Joshi and David B. Searls.
- A grammatical theory for the conformational changes of simple
helix bundles. 2006. J. Computational Biology 13(1):21–42. With Aravind
K. Joshi and Ken A. Dill.
Formal grammars in general and their applications
- The weak generative capacity of linear tree adjoining grammars. 2006. In Proc. TAG+8. [PDF]
- Evaluation of Grammar Formalisms for Applications to Natural
Language Processing and Biological Sequence Analysis, PhD
dissertation, University of Pennsylvania, 2004. Rubinoff Award. [PDF, 600k] Slides from defense: [PDF]
- Uses and abuses of intersected languages. 2004. In Proc. TAG+7, pages 9–15. [PS] [PDF]
- Putting some weakly context-free formalisms in order. 2002. In Proc. TAG+6, pages 11–18. [PS] [PDF]
Software and other stuff