ISI Natural Language Seminar

Translating faster than a keystroke and dumpster diving for training data

Event Details

Reminder: Meeting hosts admit only guests they know to the Zoom meeting, so you are strongly encouraged to sign into Zoom with your USC account.

If you’re an outside visitor, please email us beforehand at nlg-seminar-host(at)isi.edu so that we know to expect you and can let you in.

For more information on the NL Seminar series and upcoming talks, please visit:

https://nlg.isi.edu/nl-seminar/ 

Machine translation has a deserved reputation for computational cost. But by burning even more GPU time upfront, we can make inference fast enough to translate thousands of words per second on a desktop, or a sentence in under 10 ms on one CPU core. I will talk about the optimizations that make this possible, from chopping off transformer heads to writing assembly. Software is available at translatelocally.com and coming soon as a Firefox extension. Fast translation was also useful for the ParaCrawl project, where we went dumpster diving on the web for translations and found a few COMET/BLEU points.

Speaker Bio

Kenneth Heafield is a Reader (that's Associate Professor in en-US) at the University of Edinburgh working on fast and often good machine translation. He coordinates the Bergamot project, which is adding local translation to Firefox, ran the ParaCrawl project, and was friendly competition with ISI in MATERIAL. He wrote kenlm to do large language models before they were cool.