Research seminar: Jia Xu, Tsinghua University, “Ensemble learning in machine translation”

Info about event

Time

Monday 27 October 2014, 15:15 - 16:15

Location

Ada-333

Abstract

Ensemble methods are machine learning algorithms that construct a set of classifiers and combine their results in a way that outperforms each individual classifier. In a big data world, Natural Language Processing (NLP) is essential for analyzing data related to human communication. We will study one of the most challenging NLP tasks, machine translation (multi-label classification), and discuss the bridge between application and theory in ensemble learning as applied to contemporary statistical machine translation, with a focus on two issues: (1) how to produce different classifiers through approaches such as corpus mining (via IR, pivot languages and comparable corpora) and bagging; (2) how to combine the results of single classifiers at the level of models and systems, for instance through synchronous model training and mixture of experts.