KRISHI
ICAR RESEARCH DATA REPOSITORY FOR KNOWLEDGE MANAGEMENT
(An Institutional Publication and Data Inventory Repository)
"Not Available": Please do not remove the default option "Not Available" for the fields where metadata information is not available
"1001-01-01": Date not available or not applicable for filling metadata information
Please use this identifier to cite or link to this item:
http://krishi.icar.gov.in/jspui/handle/123456789/76502
Title: | MetaConClust - Unsupervised Binning of Metagenomics Data Using Consensus Clustering |
Other Titles: | Not Available |
Authors: | Sinha Dipro; Sharma Anu; Mishra Dwijesh Chandra; Rai Anil; Lal Shashi Bhushan; Kumar Sanjeev; Farooqi Md. Samir; K. K. Chaturvedi |
ICAR Data Use Licence: | http://krishi.icar.gov.in/PDF/ICAR_Data_Use_Licence.pdf |
Author's Affiliated institute: | ICAR::Indian Agricultural Statistics Research Institute; ICAR::Indian Agricultural Research Institute |
Published/ Complete Date: | 2022-04-29 |
Project Code: | Not Available |
Keywords: | Binning; PAM; consensus clustering; coverage; metagenomics; unsupervised clustering |
Publisher: | Not Available |
Citation: | Not Available |
Series/Report no.: | Not Available |
Abstract/Description: | Background: Binning of metagenomic reads is an active area of research, and many unsupervised machine learning techniques have been used for taxonomy-independent binning of metagenomic reads. Objective: To determine the optimal number of clusters and to develop an efficient pipeline for deciphering the complexity of microbial genomes. Methods: Applying unsupervised clustering techniques to binning requires the optimal number of clusters to be found beforehand, which has been observed to be a difficult task. This paper describes a novel method, MetaConClust, which uses coverage information to group contigs and automatically finds the optimal number of clusters for binning metagenomics data using a consensus-based clustering approach. The coverage of contigs in a metagenomics sample has been observed to be directly proportional to the abundance of species in the sample, and MetaConClust uses it to group the data in the first phase. In the second phase, the Partitioning Around Medoids (PAM) method is used to generate bins, with the initial number of clusters determined automatically through a consensus-based method. Results: The quality of the obtained bins is tested using the silhouette index, Rand index, recall, precision, and accuracy. The performance of MetaConClust is compared with recent methods and tools using benchmarked low-complexity simulated and real metagenomic datasets; it is found to be better than unsupervised methods and comparable to hybrid methods. |
Description: | Not Available |
ISSN: | Not Available |
Type(s) of content: | Research Paper |
Sponsors: | Not Available |
Language: | English |
Name of Journal: | Current Genomics |
Journal Type: | NAAS Journal |
NAAS Rating: | 8.24 |
Impact Factor: | 2.24 |
Volume No.: | 23(2) |
Page Number: | 137-146 |
Name of the Division/Regional Station: | Not Available |
Source, DOI or any other URL: | https://doi.org/10.2174/1389202923666220413114659 |
URI: | http://krishi.icar.gov.in/jspui/handle/123456789/76502 |
Appears in Collections: | AEdu-IASRI-Publication |
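Illustrative note on the abstract: the second phase it describes (PAM clustering with an automatically chosen number of clusters, checked with the silhouette index) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' MetaConClust implementation. The toy contig features (coverage and GC fraction), the simplified swap-based PAM, and all function names are hypothetical. Choosing the cluster count by the best silhouette index is a stand-in for the consensus-based method, which the abstract does not specify.

```python
import random
from itertools import combinations

def distance_matrix(points):
    """Pairwise Euclidean distances between feature vectors."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5
        d[i][j] = d[j][i] = dist
    return d

def total_cost(d, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(d[i][m] for m in medoids) for i in range(len(d)))

def assign(d, medoids):
    """Label each point with the position of its nearest medoid."""
    return [min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
            for i in range(len(d))]

def pam(d, k, seed=0):
    """Simplified PAM: random initial medoids, then greedy swaps
    until no single medoid/non-medoid swap lowers the total cost."""
    rng = random.Random(seed)
    n = len(d)
    medoids = rng.sample(range(n), k)
    best = total_cost(d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(d, trial)
                if cost < best - 1e-12:
                    medoids, best, improved = trial, cost, True
    return medoids, assign(d, medoids)

def silhouette(d, labels):
    """Mean silhouette index over all points (0.0 for singleton clusters)."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(d[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(sum(d[i][j] for j in members) / len(members)
                for other, members in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(labels)

# Hypothetical toy contigs: (mean coverage, GC fraction), built as three groups.
contigs = [
    (5.1, 0.40), (4.8, 0.42), (5.3, 0.41), (5.0, 0.39),
    (20.2, 0.55), (19.7, 0.57), (20.5, 0.56), (19.9, 0.54),
    (60.4, 0.65), (59.8, 0.66), (60.1, 0.64), (61.0, 0.67),
]
d = distance_matrix(contigs)

# Stand-in for the consensus step: keep the k with the best silhouette index.
scores = {k: silhouette(d, pam(d, k)[1]) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
medoids, bins = pam(d, best_k)
print(best_k, bins)
```

In this sketch the features are left unscaled, so coverage dominates the distance. A real pipeline would likely normalise the features and use many more of them.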
Files in This Item:
There are no files associated with this item.
Items in KRISHI are protected by copyright, with all rights reserved, unless otherwise indicated.