Research seminar: Jia Xu, Tsinghua University, “Ensemble learning in machine translation”
Event information
Time
Location
Ada-333
Abstract
Ensemble methods are machine learning algorithms that construct a set of classifiers and combine their outputs so that the combination outperforms each individual classifier. In a big-data world, Natural Language Processing (NLP) is essential for analyzing data related to human communication. We will study one of the most challenging NLP tasks, machine translation (viewed as multi-label classification), and discuss the bridge between theory and application for ensemble learning in contemporary statistical machine translation, focusing on two issues: (1) how to produce different classifiers through approaches such as corpus mining (via information retrieval (IR), pivot languages, and comparable corpora) and bagging; (2) how to combine the results of the individual classifiers at the level of models and systems, for instance through synchronous model training and mixture of experts.
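As a rough illustration of the two issues named in the abstract, the sketch below produces different classifiers by bagging (bootstrap resampling of the training data) and combines them by majority vote. It is a toy under stated assumptions, not the speaker's method: the task is one-dimensional binary classification rather than translation, the base learner is a simple threshold rule, and all names (`bootstrap_sample`, `train_stump`, `bagging_fit`, `bagging_predict`) are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) examples from data with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Fit a threshold rule 'predict 1 if x >= threshold' on (x, label) pairs.

    The threshold is picked from the x values in the sample so as to
    minimize the number of training errors (toy base learner, assumed here).
    """
    best_threshold, best_errors = None, None
    for threshold, _ in sample:
        errors = sum((x >= threshold) != bool(y) for x, y in sample)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def bagging_fit(data, n_models, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def bagging_predict(thresholds, x):
    """Combine the stumps by majority vote over their 0/1 predictions."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Made-up training data: label 0 for small x, label 1 for large x.
    data = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]
    models = bagging_fit(data, n_models=11)
    print(bagging_predict(models, -1.0), bagging_predict(models, 6.0))
```

An odd number of models avoids tied votes in the binary case. A mixture of experts would replace the equal-weight vote with input-dependent weights, which this sketch does not attempt.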