The Natural Language Group


Our GitHub organization hosts the latest list of tools.

English-to-AMR parser. Converts English sentences into Abstract Meaning Representations (AMRs). Built by Michael Pust, Ulf Hermjakob, Kevin Knight, Daniel Marcu, and Jonathan May. (Download size: 719 MB.)

AMR-to-English generator. Converts Abstract Meaning Representations (AMRs) into English sentences. Built by Nima Pourdamghani.

uroman. Converts text in any script to the Latin alphabet. Online interface. Built by Ulf Hermjakob.

Monogiza. Extracts a word-for-word translation table from non-parallel corpora. Built by Qing Dou.

Carmel. Finite-state transducer package for strings, built by Jonathan Graehl. Latest version on GitHub.

Tiburon. Finite-state transducer package for trees, built by Jonathan May.

Bolinas. Hyperedge replacement transducer package for graphs, built by Jacob Andreas, Daniel Bauer, David Chiang, Karl Moritz Hermann, Bevan Jones, and Kevin Knight.

Zoph_RNN. GPU-based neural LSTM sequence-to-sequence modeling toolkit, built by Barret Zoph.

EUREKA. CPU-based neural LSTM sequence-to-sequence modeling toolkit, built by Ashish Vaswani.

NPLM. Neural probabilistic language model toolkit, built by Ashish Vaswani, with contributions from David Chiang and Victoria Fossum.

SPADE. Sentence-level Discourse Parser. Built by Radu Soricut.

ReWrite Decoder. Greedy Decoder for IBM SMT Models. Built by Daniel Marcu and Ulrich Germann.


Software demos:

Portmanteau generator. Creates a new word (neologism) from two existing words, built by Aliya Deri.

Poetry password demo and assigner. Shows poems created from randomly generated 60-bit passwords, built by Marjan Ghazvininejad.

Smatch. Evaluates output of semantic parsing. Built by Shu Cai.

Poetry generator. Creates a poem on any topic. Built by Marjan Ghazvininejad, Xing Shi, Yejin Choi, and Kevin Knight.



HowToSpeak. Allows users to speak a language they don't understand, by phonetic rendering. Built by Xing Shi.
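
To give a feel for the 60-bit passwords mentioned in the poetry password entry above, here is a minimal sketch of drawing such a password and cutting it into fixed-width pieces. It is not the demo's code: the function names, the 15-bit chunk width, and the idea of using each chunk as an index into some word list are all assumptions made for illustration. The listing does not say how the demo maps bits to poems.

```python
import secrets

PASSWORD_BITS = 60  # size stated in the listing; everything else here is hypothetical


def random_password_bits(n_bits: int = PASSWORD_BITS) -> int:
    """Draw a uniformly random n_bits-wide integer from the OS entropy source."""
    return secrets.randbits(n_bits)


def split_into_chunks(value: int, n_bits: int, chunk_bits: int) -> list[int]:
    """Split an n_bits-wide integer into equal chunks, most significant first.

    Each chunk could serve as an index into a table of 2**chunk_bits items
    (an assumed encoding, not one documented by the demo).
    """
    if n_bits % chunk_bits != 0:
        raise ValueError("chunk_bits must divide n_bits evenly")
    mask = (1 << chunk_bits) - 1
    return [
        (value >> shift) & mask
        for shift in range(n_bits - chunk_bits, -1, -chunk_bits)
    ]


password = random_password_bits()
chunks = split_into_chunks(password, PASSWORD_BITS, 15)  # four 15-bit indices
print(f"{password:060b}", chunks)
```

A 60-bit password has 2**60 (about 1.15 quintillion) equally likely values, which is the figure that matters for guessing resistance regardless of how the bits are rendered.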


Annotation and data collection interfaces:

Shannon Game. Collects character-level text predictions from people, in order to estimate the entropy of translation. Built by Marjan Ghazvininejad.

AMR Editor. Allows human annotators to type in the meanings of English sentences, using the Abstract Meaning Representation framework. Built by Ulf Hermjakob. Video.

RST Annotation Tool. Enables annotators to build Rhetorical Structure Representations for texts. Built by Benjamin Liberman.
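
As a rough illustration of how guessing data of the kind the Shannon Game collects can be turned into an entropy figure, the sketch below applies Shannon's classic upper bound: the entropy of the distribution of guess counts (how many tries a person needed for each character). Whether the Shannon Game tool uses this particular estimator is not stated in this listing; the function name and the input format are assumptions.

```python
import math
from collections import Counter


def guess_entropy_upper_bound(guess_counts: list[int]) -> float:
    """Entropy (bits per character) of the observed guess-count distribution.

    guess_counts[i] is the number of guesses needed before the i-th
    character was predicted correctly (1 = right on the first try).
    This value upper-bounds the per-character entropy of the text
    under the guesser's predictive model.
    """
    total = len(guess_counts)
    if total == 0:
        raise ValueError("need at least one observation")
    freq = Counter(guess_counts)
    return -sum((c / total) * math.log2(c / total) for c in freq.values())


# Hypothetical session: most characters guessed on the first try.
print(guess_entropy_upper_bound([1, 1, 1, 2, 1, 3, 1, 1]))
```

If every character is guessed on the first try the bound is 0 bits; the more spread out the guess counts, the higher the estimate.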


Shared tasks:

Bilingual compression challenge. If we exploit the high redundancy of human-translated texts, what is the best compression rate we can achieve for bilingual texts? Run by Barret Zoph, Kevin Knight, and Marjan Ghazvininejad.

AMR parsing. This 2016 SemEval challenge asks participants to write software to convert English into Abstract Meaning Representations. Run by Jonathan May.