Research seminar: Jia Xu, Tsinghua University, “Ensemble learning in machine translation”

Event information

Time

Monday 27 October 2014, 15:15 - 16:15

Location

Ada-333

By Katrine Aakjær Nielsen

Abstract

Ensemble methods are machine learning algorithms that construct a set of classifiers and combine their results so that the combination outperforms each individual classifier. In a big data world, Natural Language Processing (NLP) is essential for analyzing data related to human communication. We will study one of the most challenging NLP tasks, machine translation (multi-label classification), and discuss the bridge between applications and theory in ensemble learning as applied in contemporary statistical machine translation, with a focus on two issues: (1) how to produce different classifiers through approaches such as corpus mining (via IR, pivot languages and comparable corpora) and bagging; (2) how to combine the results of single classifiers at the level of models and systems, for instance through synchronous model training and mixture of experts.
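
For readers unfamiliar with the two issues named in the abstract, the following is a minimal sketch of bagging followed by majority voting. It is a generic toy illustration, not the speaker's method and not a translation system: the one-dimensional data, the centroid-based base learner, and all function names (`bootstrap_sample`, `train_centroid_classifier`, `bagging`, `majority_vote`) are assumptions made up for this example.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (a bootstrap replicate)."""
    return [rng.choice(data) for _ in data]


def train_centroid_classifier(sample):
    """Hypothetical toy base learner: remember the mean feature value per label."""
    sums, counts = {}, {}
    for x, label in sample:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: sums[label] / counts[label] for label in sums}

    def predict(x):
        # Predict the label whose centroid is closest to x.
        return min(centroids, key=lambda label: abs(x - centroids[label]))

    return predict


def bagging(data, n_models, seed=0):
    """Issue (1): produce different classifiers by training on bootstrap replicates."""
    rng = random.Random(seed)
    return [
        train_centroid_classifier(bootstrap_sample(data, rng))
        for _ in range(n_models)
    ]


def majority_vote(models, x):
    """Issue (2): combine the single classifiers by taking the most common prediction."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy labelled data: (feature, label) pairs, invented for illustration only.
data = [(0.0, "a"), (0.2, "a"), (0.4, "a"), (1.6, "b"), (1.8, "b"), (2.0, "b")]
models = bagging(data, n_models=11)
prediction = majority_vote(models, 0.1)
```

An odd number of models is used so that a two-label vote cannot tie. Approaches mentioned in the abstract such as mixture of experts would replace the plain vote with learned, input-dependent weights.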