Record Details

Design, Development and Implementation of Novel in silico Data Mining, Clustering and Visualization Tools for Comparative Genome and Proteome Analysis

EPrints@IICB


Field	Value

Title	Design, Development and Implementation of Novel in silico Data Mining, Clustering and Visualization Tools for Comparative Genome and Proteome Analysis

Creator	Bag, Sumit K

Subject	Structural Biology & Bioinformatics

Description	If one recited the human genome sequence at a rate of 5 bases per second, 24 hours a day, it would take about 19 years to recite the book of life. This counts only the 3 billion bases of the genome sequence itself, excluding annotations and other information associated with sequence data. As of April 15, 2012, GenBank release 189.0 holds 139,266,481,398 bases from 151,824,421 reported sequences, nearly 47 times the human genome sequence content (GenBank release note - ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt). The number of sequenced genomes is growing at an unprecedented pace, and there is no reason to expect it to slow down in the future. On the contrary, the introduction of genomics at a massive scale with 'second-generation sequencing technologies' such as 454, Solexa or SOLiD adds to this expectation (Mardis 2008), as illustrated by the recent genome sequencing of a single individual. Importantly, these new technologies promise to be very useful for genome analysis in non-model organisms (Ellegren 2008; Hudson 2008; Vera et al. 2008). Moreover, the 1000 Genomes Project will create a new map of genetic variation for the human genome. Other projects are helping to catalogue, for example, genes involved in cancer, alternative splicing in different tissues and transcription factor binding. The evolution of 'omic' science through microarray transcriptomics, metabolomics, proteomics, etc. is generating this huge volume of data. These data have begun to revolutionize genomics, and their effects are becoming increasingly widespread. There is a vital need to warehouse, harness, disseminate, analyze and interpret this torrent of data, not only to quench the academic thirst of researchers, but also for medical diagnostic and therapeutic uses and several other biotechnological applications.
For this, the use of computational approaches was inevitable, and thus emerged Bioinformatics, a new field of scientific enquiry that represents a confluence of biology, computer science and information technology.

Date	2012

Type	Thesis; NonPeerReviewed

Format	application/pdf

Identifier	http://www.eprints.iicb.res.in/1470/1/SUMIT_KUMAR_BAG_Thesis_2012.PDF; Bag, Sumit K (2012) Design, Development and Implementation of Novel in silico Data Mining, Clustering and Visualization Tools for Comparative Genome and Proteome Analysis. PhD thesis, Jadavpur University.

Relation	http://www.eprints.iicb.res.in/1470/

ICAR Research Data Repository for Knowledge Management
