Record Details

Multiple outlier detection in multivariate data using self-organizing maps

DSpace at IIT Bombay

 
Field Value
 
Title Multiple outlier detection in multivariate data using self-organizing maps
 
Creator NAG, AK
MITRA, A
MITRA, S
 
Subject dispersion matrices
robust estimation
leverage points
high dimension
s-estimators
location
identification
covariance
scatter
shape
artificial intelligence
minimum covariance determinant
minimum volume ellipsoid
multivariate outliers
self-organizing maps
unified distance matrix
 
Description Detecting multidimensional outliers is a fundamental and important problem in applied statistics. The unreliability of multivariate outlier detection techniques such as Mahalanobis distance and hat matrix leverage has led to the development of techniques that have been known in the statistical community for well over a decade. The literature on this subject is vast and growing. In this paper, we propose to use the artificial intelligence technique of the self-organizing map (SOM) for detecting multiple outliers in multidimensional datasets. SOM, which produces a topology-preserving mapping of the multidimensional data cloud onto a lower-dimensional, visualizable plane, provides an easy way of detecting multidimensional outliers in the data at their respective levels of leverage. The proposed SOM-based method not only identifies the multidimensional outliers but also provides information about the entire outlier neighbourhood. Being an artificial intelligence technique, SOM-based outlier detection is non-parametric and can be used to detect outliers in very large multidimensional datasets. The method is applied to detect outliers in varied types of simulated multivariate datasets, a benchmark dataset, and a real-life cheque processing dataset. The results show that SOM can be used effectively for multidimensional outlier detection.
 
Publisher PHYSICA-VERLAG GMBH & CO
 
Date 2011-08-27T04:30:48Z
2011-12-26T12:57:43Z
2011-12-27T05:44:59Z
2005
 
Type Article
 
Identifier COMPUTATIONAL STATISTICS, 20(2), 245-264
0943-4062
http://dx.doi.org/10.1007/BF02789702
http://dspace.library.iitb.ac.in/xmlui/handle/10054/11541
http://hdl.handle.net/10054/11541
 
Language en