Towards Accurate Distant Supervision for Relational Facts Extraction
Xingxing Zhang, Jianwen Zhang, Junyu Zeng, Jun Yan, Zheng Chen and Zhifang Sui
The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013
Distant supervision (DS) is an appealing learning method that learns from existing relational facts to extract more facts from a text corpus. However, its accuracy is still unsatisfactory. In this paper, we identify and analyze several critical factors in DS that have a great impact on accuracy: valid entity type detection, negative training example construction, and ensembling. We propose an approach to handle these factors. Through experiments on extracting Freebase facts (the top 92 relations) from Wikipedia articles, we show the impact of the three factors on the accuracy of DS and the remarkable improvement achieved by the proposed approach.