Bootstrapping Entity Translation on Weakly Comparable Corpora

Taesung Lee and Seung-won Hwang

The 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)
Sofia, Bulgaria, August 4-9, 2013


This paper studies the problem of mining named entity translations from comparable corpora that exhibit some "asymmetry". Unlike previous approaches, which rely on the "symmetry" found in parallel corpora, the proposed method tolerates the asymmetry often found in comparable corpora: it distinguishes the different semantics of relations between entity pairs and uses them to selectively propagate seed entity translations over weakly comparable corpora. Our experimental results on English-Chinese corpora show that this selective propagation approach outperforms previous approaches to named entity translation in mean reciprocal rank, by up to 0.16 for organization names and 0.14 in a low-comparability case.
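
For readers unfamiliar with the evaluation metric named above, mean reciprocal rank (MRR) averages, over all source entities, the reciprocal of the rank at which the correct translation appears in a system's candidate list. The following is a minimal sketch of the standard definition, not the authors' evaluation code. The function name, the convention of scoring a missing translation as 0, and the toy entity and candidate identifiers are all assumptions made for illustration.

```python
from typing import Dict, List


def mean_reciprocal_rank(ranked: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Average 1/rank of the correct translation over all gold entities.

    Assumption: an entity whose correct translation is missing from its
    candidate list contributes 0 to the sum.
    """
    total = 0.0
    for source, correct in gold.items():
        candidates = ranked.get(source, [])
        if correct in candidates:
            # Ranks are 1-based, list indices are 0-based.
            total += 1.0 / (candidates.index(correct) + 1)
    return total / len(gold) if gold else 0.0


# Hypothetical toy data, not taken from the paper.
ranked = {
    "entity_a": ["cand_1", "cand_2", "cand_3"],  # correct answer at rank 2
    "entity_b": ["cand_4", "cand_5"],            # correct answer at rank 1
}
gold = {"entity_a": "cand_2", "entity_b": "cand_4"}

mrr = mean_reciprocal_rank(ranked, gold)
print(mrr)  # (1/2 + 1/1) / 2 = 0.75
```

Under this definition, an absolute gain of 0.16 in MRR means the average reciprocal rank of the correct translation rose by 0.16; the score is always between 0 and 1, with 1 meaning every correct translation is ranked first.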

