Jian Zhang - Transfer Talk - 17th June 2015

Video Category: Transfer Talk
Title: Domain Adaptation of Statistical Machine Translation

Supervisors: Prof. Qun Liu, Prof. Andy Way


Statistical machine translation (SMT) is a hard problem because of the nature of human languages. The same content can be expressed in different genres, topics or styles, each using a different vocabulary. Because SMT is a data-driven learning approach, it requires a large quantity of human-translated sentences to train statistical models and produce translation output. This variation in human language often biases the statistical model towards one genre, topic or style, which is usually referred to as a “domain” in the literature. In real-world situations, domain-specific resources such as bilingual sentence pairs are often in high demand, but their availability is low. Creating or extending such resources is also labour-intensive and time-consuming. On the other hand, some content-rich, domain-specific resources have been well studied, and natural language processing applications built on them have produced very positive results. This research focuses on domain adaptation for SMT.