Record Details

Models and indices for integrating unstructured data with a relational database

DSpace at IIT Bombay

 
 
Field Value
 
Title Models and indices for integrating unstructured data with a relational database
 
Creator SARAWAGI, S
 
Description Database systems are islands of structure in a sea of unstructured data sources. Several real-world applications now need to create bridges for smooth integration of semi-structured sources with existing structured databases for seamless querying. This integration requires extracting structured column values from the unstructured source and mapping them to known database entities. Existing methods of data integration do not effectively exploit the wealth of information available in multi-relational entities. We present statistical models for co-reference resolution and information extraction in a database setting. We then go over the performance challenges of training and applying these models efficiently over very large databases. This requires us to break open a black-box statistical model and extract predicates over indexable attributes of the database. We show how to extract such predicates for several classification models, including naive Bayes classifiers and support vector machines. We extend these indexing methods to support the similarity predicates needed during data integration.
 
Publisher SPRINGER-VERLAG BERLIN
 
Date 2011-10-23T17:23:51Z
2011-12-15T09:11:15Z
2005
 
Type Article; Proceedings Paper
 
Identifier KNOWLEDGE DISCOVERY IN INDUCTIVE DATABASES,3377,1-10
3-540-25082-4
0302-9743
http://dspace.library.iitb.ac.in/xmlui/handle/10054/15188
http://hdl.handle.net/100/1953
 
Source 3rd International Workshop Knowledge Discovery in Inductive Databases, Pisa, ITALY, SEP 20, 2004
 
Language English