Products

Current Projects

Our GitHub organization hosts the latest list of tools.

Code
Demos
APIs
Tools
Data

Code

AMR-to-English generator
Converts Abstract Meaning Representations (AMR) into English sentences. Built by Nima Pourdamghani.
ASTRAPOP
Tool for authorship style transfer. Built by Shuai Liu.
Bolinas
Hyperedge replacement transducer package for graphs, built by Jacob Andreas, Daniel Bauer, David Chiang, Karl Moritz Hermann, Bevan Jones, and Kevin Knight.
Carmel
Finite-state transducer package for strings, built by Jonathan Graehl. Latest version on GitHub.
English-to-AMR parser
Converts English sentences into Abstract Meaning Representations (AMRs). Built by Michael Pust, Ulf Hermjakob, Kevin Knight, Daniel Marcu, and Jonathan May (download size: 719 MB).
EUREKA
CPU-based neural LSTM sequence-to-sequence modeling toolkit, built by Ashish Vaswani.
Monogiza
Extracts a word-for-word translation table from non-parallel corpora. Built by Qing Dou.
MTData
A tool capable of retrieving thousands of parallel datasets for machine translation research. Built by Thamme Gowda.
NLCodec and NLDb
A scalable tool for mapping words, characters, and BPE subwords into integer sequences, and a storage layer for efficiently storing and retrieving large-scale datasets. Built by Thamme Gowda.
NPLM
Neural probabilistic language model toolkit, built by Ashish Vaswani, with contributions from David Chiang and Victoria Fossum.
Reader Translator Generator (RTG)
A feature-rich neural machine translation toolkit based on PyTorch, with a focus on reproducible experiments. Built by Thamme Gowda.
ReWrite Decoder
Greedy Decoder for IBM SMT Models. Built by Daniel Marcu and Ulrich Germann.
SPADE
Sentence-level Discourse Parser. Built by Radu Soricut.
Tiburon
Finite-state transducer package for trees, built by Jonathan May.
uroman
Converts text in any script to the Latin alphabet. Built by Ulf Hermjakob.
utoken
Universal tokenizer, i.e., a word segmenter for a wide variety of scripts and languages. Built by Ulf Hermjakob.
Zoph_RNN
GPU-based neural LSTM sequence-to-sequence modeling toolkit, built by Barret Zoph.

Demos

BotEval
Web interface for evaluating chatbots, with optional MTurk integration. Built by Hyundong (Justin) Cho.
Many-English NMT
A multilingual NMT model that can translate from 500 source languages to English. Built by Thamme Gowda.
Poetry generator
Creates a poem on any topic. Built by Marjan Ghazvininejad, Xing Shi, Yejin Choi, and Kevin Knight.
Poetry password demo and assigner
Shows poems created from randomly generated 60-bit passwords. Built by Marjan Ghazvininejad.
Portmanteau generator
Creates a new word (neologism) from two existing words. Built by Aliya Deri.
Smatch
Evaluates output of semantic parsing. Built by Shu Cai.
Spolin Bot
Chat with our improvisation bot!

APIs

HowToSpeak
Allows users to speak a language they don't understand, by phonetic rendering. Built by Xing Shi.

Tools

AMR Editor
Allows human annotators to type in the meanings of English sentences, using the Abstract Meaning Representation framework. Built by Ulf Hermjakob. An AMR Editor overview video is available.
RST Annotation Tool
Enables annotators to build Rhetorical Structure Representations for texts. Built by Benjamin Liberman.
Shannon Game
Collects character-level text predictions from people, in order to estimate the entropy of translation. Built by Marjan Ghazvininejad.

Data

AMR parsing
This 2016 SemEval challenge asks participants to write software to convert English into Abstract Meaning Representations. Run by Jonathan May.
Bilingual compression challenge
If we exploit the high redundancy of human-translated texts, what is the best compression rate we can achieve for bilingual texts? Run by Barret Zoph, Kevin Knight, and Marjan Ghazvininejad.
NewsEdits
A news article revision dataset and a novel document-level reasoning challenge. Built by Alexander Spangher.

Information Sciences Institute

Natural Language Group
