More than the sum of their parts: Translating idioms without destroying their meaning

Thursday, September 5, 2019, 11:00 am - 12:00 pm PDT
CR# 689
This event is open to the public.
NL Seminar
Denis Emelin and Prince Wang

Abstract: Translating idioms is hard. As low-frequency linguistic events with non-compositional meaning, idiomatic expressions are at odds with contemporary neural machine translation (NMT) methods. Accordingly, literal translation of idiomatic phrases, which fails to preserve their semantic content, is a frequently observed failure case in NMT models. To facilitate future work on idiom translation, the current project sets out to compile a large-coverage, multilingual corpus of parallel sentences containing idiomatic expressions, augmented with their respective monolingual definitions. With this resource in hand, we next aim to propose models that can effectively exploit idiom definitions to avoid literal translation errors. As part of the evaluation of the constructed corpus, we demonstrate that idioms continue to pose a substantial challenge for state-of-the-art NMT models.

Bio: Denis is a second-year PhD candidate at the University of Edinburgh, advised by Dr. Rico Sennrich. His background is in machine translation, natural language understanding, and linguistics.
