START Conference Manager    

Relation Extraction with Distant Supervision from Incomplete Knowledge Bases

Wei Xu, Raphael Hoffmann, Ralph Grishman and Le Zhao

The 51st Annual Meeting of the Association for Computational Linguistics - Short Papers (ACL Short Papers 2013)
Sofia, Bulgaria, August 4-9, 2013


Abstract

Distant supervision has attracted recent interest for training information extraction systems because it does not require any human annotation but rather employs existing knowledge bases to heuristically label a training corpus. However, previous work has failed to address the problem of false negative training examples mislabeled due to the incompleteness of knowledge bases. To tackle this problem, we propose a simple yet novel framework that combines a passage retrieval model using coarse features into a state-of-the-art relation extractor using multi-instance learning with fine features. We adapt the information retrieval technique of pseudo-relevance feedback to expand knowledge bases, assuming entity pairs in top-ranked passages are more likely to express a relation. Our proposed technique significantly improves the quality of distantly supervised relation extraction, boosting recall from 47.7% to 61.2% with a consistently high level of precision of around 93% in the experiments.


START Conference Manager (V2.61.0 - Rev. 2792M)