Record Details

Text classification with evolving label-sets

DSpace at IIT Bombay

 
Field Value
 
Title Text classification with evolving label-sets
 
Creator GODBOLE, S
RAMAKRISHNAN, G
SARAWAGI, S
 
Description We introduce the evolving label-set problem encountered in building real-world text classification systems. This problem arises when a text classification system trained on a label-set encounters documents of unseen classes at deployment time. We design a Class-Detector module that monitors unlabeled data, detects new classes, and suggests them to the administrator for inclusion in the label-set. We propose abstractions that group together tokens under human-understandable concepts and provide a mechanism for assigning importance to unseen terms. We present generative algorithms leveraging the notion of support of documents in a model for (1) selecting documents of proposed new classes, and (2) automatically triggering detection of new classes. Experiments on three real-world taxonomies show that our methods select new class documents with high precision, and trigger emergence of new classes with low false-positive and false-negative rates.
 
Publisher IEEE COMPUTER SOC
 
Date 2011-10-24T14:43:54Z
2011-12-15T09:11:39Z
2005
 
Type Proceedings Paper
 
Identifier FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 629-632
0-7695-2278-5
http://dx.doi.org/10.1109/ICDM.2005.143
http://dspace.library.iitb.ac.in/xmlui/handle/10054/15445
http://hdl.handle.net/100/2206
 
Source 5th IEEE International Conference on Data Mining, Houston, TX, Nov 27-30, 2005
 
Language English