Non-Monotonic Sentence Alignment via Semisupervised Learning
Xiaojun Quan, Chunyu Kit and Yan Song
The 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)
Sofia, Bulgaria, August 4-9, 2013
This paper studies the problem of non-monotonic sentence alignment, motivated by the observation that coupled sentences in real bitexts do not necessarily occur monotonically, and proposes a semisupervised learning approach based on two assumptions: (1) sentences with high affinity in one language tend to have their counterparts with similar relatedness in the other; and (2) initial alignment is readily available with existing alignment techniques. They are incorporated as two constraints into a semisupervised learning framework for optimization to produce a globally optimal solution. The evaluation with real-world legal data from a comprehensive legislation corpus shows that while existing alignment algorithms suffer severely from non-monotonicity, this approach can work effectively on both monotonic and non-monotonic data.
Conference Manager (V2.61.0 - Rev. 2792M)