TY - GEN
T1 - High-performance unsupervised relation extraction from large corpora
AU - Rozenfeld, Binjamin
AU - Feldman, Ronen
PY - 2006
Y1 - 2006
N2 - We present URIES, an Unsupervised Relation Identification and Extraction system. The system automatically identifies interesting binary relations between entities in the input corpus, and then proceeds to extract a large number of instances of these relations. The system discovers relations by clustering frequently co-occurring pairs of entities, based on the contexts in which they appear. Its complex pattern-based representation of the contexts allows the clustering step to achieve very high precision, sufficient for the clusters to serve as sets of seeds for bootstrapping a high-recall relation extraction process. In a series of experiments we demonstrate the successful performance of URIES and compare it to two existing systems: a weakly supervised high-recall Web relation extraction system called SRES, and an unsupervised relation identification system that uses a simpler bag-of-words representation of contexts. The experiments show that URIES performs comparably to SRES, but without any supervision, and that this performance is due to the power of its complex context representation and to its novel candidate selection method.
AB - We present URIES, an Unsupervised Relation Identification and Extraction system. The system automatically identifies interesting binary relations between entities in the input corpus, and then proceeds to extract a large number of instances of these relations. The system discovers relations by clustering frequently co-occurring pairs of entities, based on the contexts in which they appear. Its complex pattern-based representation of the contexts allows the clustering step to achieve very high precision, sufficient for the clusters to serve as sets of seeds for bootstrapping a high-recall relation extraction process. In a series of experiments we demonstrate the successful performance of URIES and compare it to two existing systems: a weakly supervised high-recall Web relation extraction system called SRES, and an unsupervised relation identification system that uses a simpler bag-of-words representation of contexts. The experiments show that URIES performs comparably to SRES, but without any supervision, and that this performance is due to the power of its complex context representation and to its novel candidate selection method.
UR - http://www.scopus.com/inward/record.url?scp=72849107691&partnerID=8YFLogxK
U2 - 10.1109/icdm.2006.82
DO - 10.1109/icdm.2006.82
M3 - Conference contribution
AN - SCOPUS:72849107691
SN - 0769527019
SN - 9780769527017
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 1032
EP - 1037
BT - Proceedings - Sixth International Conference on Data Mining, ICDM 2006
T2 - 6th International Conference on Data Mining, ICDM 2006
Y2 - 18 December 2006 through 22 December 2006
ER -