Record Details

Collective annotation of Wikipedia entities in web text

DSpace at IIT Bombay

Field Value
 
Title Collective annotation of Wikipedia entities in web text
 
Creator KULKARNI, S
SINGH, A
RAMAKRISHNAN, G
CHAKRABARTI, S
 
Subject entity annotation/disambiguation
wikipedia
collective inference
 
Description To take the first step beyond keyword-based search toward entity-based search, suitable token spans ("spots") on documents must be identified as references to real-world entities from an entity catalog. Several systems have been proposed to link spots on Web pages to entities in Wikipedia. They are largely based on local compatibility between the text around the spot and textual metadata associated with the entity. Two recent systems exploit inter-label dependencies, but in limited ways. We propose a general collective disambiguation approach. Our premise is that coherent documents refer to entities from one or a few related topics or domains. We give formulations for the trade-off between local spot-to-entity compatibility and measures of global coherence between entities. Optimizing the overall entity assignment is NP-hard. We investigate practical solutions based on local hill-climbing, rounding integer linear programs, and pre-clustering entities followed by local optimization within clusters. In experiments involving over a hundred manually-annotated Web pages and tens of thousands of spots, our approaches significantly outperform recently-proposed algorithms.
 
Publisher ASSOC COMPUTING MACHINERY
 
Date 2011-10-25T18:34:08Z
2011-12-15T09:12:04Z
2009
 
Type Proceedings Paper
 
Identifier KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 457-465
978-1-60558-495-9
http://dspace.library.iitb.ac.in/xmlui/handle/10054/15789
http://hdl.handle.net/100/2454
 
Source 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, Jun 28-Jul 01, 2009
 
Language English
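
Illustrative sketch (not part of the catalogued record). The Description mentions a trade-off between local spot-to-entity compatibility and global coherence, optimized in part by local hill-climbing. The record does not give the paper's actual formulation, so the following Python is only a minimal sketch under stated assumptions: the objective is assumed to be the sum of local scores plus a weighted sum of pairwise coherence scores, and the names objective, hill_climb, local_score, coherence and weight, as well as the spots, entity labels and numbers in the example, are all hypothetical.

```python
from itertools import combinations


def objective(assignment, local_score, coherence, weight):
    """Assumed objective: local compatibility plus weighted pairwise coherence."""
    total = sum(local_score[spot][entity] for spot, entity in assignment.items())
    for e1, e2 in combinations(assignment.values(), 2):
        # Coherence is looked up by unordered entity pair; missing pairs score 0.
        total += weight * coherence.get(frozenset((e1, e2)), 0.0)
    return total


def hill_climb(local_score, coherence, weight=1.0):
    """Greedy local search: change one spot's entity whenever that strictly
    improves the objective, and stop when no single change helps."""
    # Start from the locally best candidate for every spot.
    assignment = {spot: max(cands, key=cands.get) for spot, cands in local_score.items()}
    best = objective(assignment, local_score, coherence, weight)
    improved = True
    while improved:
        improved = False
        for spot, cands in local_score.items():
            for entity in cands:
                if entity == assignment[spot]:
                    continue
                trial = dict(assignment)
                trial[spot] = entity
                value = objective(trial, local_score, coherence, weight)
                if value > best + 1e-12:  # strict improvement guarantees termination
                    assignment, best, improved = trial, value, True
    return assignment, best


# Hypothetical toy input: two spots, each with two candidate entities.
example_local = {
    "Jaguar": {"Jaguar_(animal)": 0.6, "Jaguar_Cars": 0.4},
    "Mustang": {"Ford_Mustang": 0.7, "Mustang_(horse)": 0.3},
}
example_coherence = {
    frozenset(("Jaguar_Cars", "Ford_Mustang")): 0.9,
    frozenset(("Jaguar_(animal)", "Mustang_(horse)")): 0.5,
}
example_assignment, example_score = hill_climb(example_local, example_coherence)
print(example_assignment, example_score)
```

In this toy input the locally best label for "Jaguar" is the animal, but the coherence term with "Ford_Mustang" moves the search to the car sense. Hill-climbing of this kind only reaches a local optimum, which is consistent with the Description's statement that the exact problem is NP-hard.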