Comparative performance of imputation methods for different proportions of missing data in classification of crop genotypes

Samarendra Das; Amrit Kumar Paul; S.D. Wahi; U.K. Pradhan

KRISHI

ICAR RESEARCH DATA REPOSITORY FOR KNOWLEDGE MANAGEMENT
(An Institutional Publication and Data Inventory Repository)

Please use this identifier to cite or link to this item: http://krishi.icar.gov.in/jspui/handle/123456789/43020

Title:	Comparative performance of imputation methods for different proportions of missing data in classification of crop genotypes
Other Titles:	Not Available
Authors:	Samarendra Das; Amrit Kumar Paul; S.D. Wahi; U.K. Pradhan
ICAR Data Use Licence:	http://krishi.icar.gov.in/PDF/ICAR_Data_Use_Licence.pdf
Author's Affiliated institute:	ICAR::Indian Agricultural Statistics Research Institute
Published/ Complete Date:	2017-04-18
Project Code:	Not Available
Keywords:	Missing values; Genotypes; Classification; Mean imputation; Regression imputation; Multiple imputation; Hit ratio
Publisher:	Indian Society of Agricultural Statistics
Citation:	Das, S., Paul, A.K., Wahi, S.D., Pradhan, U.K. (2017). Comparative performance of imputation methods for different proportions of missing data in classification of crop genotypes. Journal of the Indian Society of Agricultural Statistics, 71(2): 147–153.
Series/Report no.:	Not Available
Abstract/Description:	Most crop datasets contain missing values, which can cause severe problems in the analysis and limit the utility of the resulting inference. Classification techniques for grouping crop genotypes are used when the data are complete. However, the presence of missing values limits the utility of these techniques and biases the resulting inferences. In the majority of cases, missing values are handled by deleting the genotypes or traits that contain them, thereby losing information on these genotypes. An alternative approach is to impute the missing values. In this paper, we provide some solutions for handling missing data in crop breeding experiments for classification of crop genotypes. The performance of the imputation techniques is assessed using the hit ratio criterion, computed with four different classifiers in an extensive simulation study. The paper also describes the missing data mechanism in agricultural experiments and various imputation techniques for missing data analysis in classification problems. For lower proportions of missing data, all four imputation techniques provided satisfactory results for classification of crop genotypes. For a moderate level of missingness, the regression and multiple imputation techniques provided the same level of precision. When the proportion of missing data is high, multiple imputation outperformed all the other imputation techniques. Among the classifiers, k-th nearest neighbor is the best classification technique in missing data situations.
Description:	Not Available
ISSN:	0019-6363
Type(s) of content:	Research Paper
Sponsors:	Not Available
Language:	English
Name of Journal:	Journal of the Indian Society of Agricultural Statistics
NAAS Rating:	5.51
Volume No.:	71(2)
Page Number:	147–153
Name of the Division/Regional Station:	Statistical Genetics
Source, DOI or any other URL:	Not Available
URI:	http://krishi.icar.gov.in/jspui/handle/123456789/43020
Appears in Collections:	AEdu-IASRI-Publication
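
The evaluation loop described in the abstract (delete values, impute, classify, compute the hit ratio) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the simulated two-group data, the missing proportions, completely-at-random deletion, k = 3, and leave-one-out scoring. Only mean imputation is shown, and the hit ratio is taken here to be the proportion of genotypes assigned to their true group.

```python
import random
from statistics import mean


def simulate(n_per_group=30, n_traits=3, shift=3.0, seed=0):
    """Hypothetical data: two genotype groups whose trait means differ by `shift`."""
    rng = random.Random(seed)
    xs, ys = [], []
    for group in (0, 1):
        for _ in range(n_per_group):
            xs.append([rng.gauss(group * shift, 1.0) for _ in range(n_traits)])
            ys.append(group)
    return xs, ys


def delete_at_random(xs, proportion, seed=1):
    """Set each value to None with the given probability (completely at random)."""
    rng = random.Random(seed)
    return [[None if rng.random() < proportion else v for v in row] for row in xs]


def mean_impute(rows):
    """Replace None in each trait with the mean of that trait's observed values.

    The mean is taken over all genotypes, ignoring group labels. A trait with
    no observed values at all is not handled and raises StatisticsError.
    """
    n_traits = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_traits)
    ]
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_traits)]
        for r in rows
    ]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(tx, x)), ty)
        for tx, ty in zip(train_x, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def hit_ratio(xs, ys, k=3):
    """Leave-one-out proportion of genotypes classified into their true group."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(train_x, train_y, xs[i], k) == ys[i]
    return hits / len(xs)


xs, ys = simulate()
for p in (0.05, 0.20, 0.40):  # illustrative low, moderate, high proportions
    imputed = mean_impute(delete_at_random(xs, p))
    print(f"missing={p:.0%}  hit ratio={hit_ratio(imputed, ys):.2f}")
```

Swapping `mean_impute` for a regression-based or multiple-imputation routine, while keeping the deletion step and classifier fixed, would give a like-for-like comparison in the spirit of the study.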

Files in This Item:

File	Description	Size	Format
isas.pdf		430.34 kB	Adobe PDF
