Record Details

MSLVP: prediction of multiple subcellular localization of viral proteins using a support vector machine.

DIR@IMTECH: CSIR-Institute of Microbial Technology

Field Value
 
Title MSLVP: prediction of multiple subcellular localization of viral proteins using a support vector machine.
 
Creator Thakur, Anamika
Rajput, Akanksha
Kumar, Manoj
 
Subject QR Microbiology
 
Description Knowledge of the subcellular location (SCL) of viral proteins in the host cell is important for understanding their function in depth. Therefore, we have developed "MSLVP", a two-tier prediction algorithm for predicting multiple SCLs of viral proteins. For this study, comprehensive data sets of viral proteins with experimentally validated SCL annotation were collected from UniProt. A non-redundant (90% sequence identity) data set of 3480 viral proteins belonging to single (2715), double (391) and multiple (374) sites was employed. Additionally, 1687 viral proteins (30% sequence identity) were categorised into single (1366), double (167) and multiple (154) sites. Single, double and multiple locations further comprised eight, four and six categories, respectively. Viral protein locations include the nucleus, cytoplasm, endoplasmic reticulum, extracellular, single-pass membrane, multi-pass membrane, capsid, remaining others and combinations thereof. Support vector machine based models were developed using sequence features such as amino acid composition, dipeptide composition, physicochemical properties and their hybrids. We employed "one-versus-one" as well as "one-versus-other" strategies for multiclass classification. The "one-versus-one" approach performed better than the "one-versus-other" approach during 10-fold cross-validation. For the 90% data set, we achieved an accuracy, Matthews correlation coefficient (MCC) and receiver operating characteristic (ROC) value of 99.99%, 1.00 and 1.00 for single; 100.00%, 1.00 and 1.00 for double; and 99.90%, 1.00 and 1.00 for multiple locations, respectively. Similar results were achieved for the 30% sequence identity data set. Predictive models for each SCL performed equally well on the independent data set. The MSLVP web server can predict subcellular locations, i.e., single (8; including single-pass and multi-pass membrane), double (4) and multiple (6). This would be helpful for elucidating the functional annotation of viral proteins and potential drug targets.
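The sequence features named in the description can be illustrated with a minimal sketch. It assumes the conventional definitions of amino acid composition (percentage of each of the 20 standard residues) and dipeptide composition (percentage of each of the 400 ordered residue pairs). The function names, the handling of non-standard residues and the simple concatenation used for the "hybrid" vector are illustrative assumptions, not the authors' implementation.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return a 20-value vector: percentage of each standard residue.

    Non-standard characters (e.g. X, B, Z) are skipped; this is an assumption.
    """
    counts = dict.fromkeys(AMINO_ACIDS, 0)
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-value vector: percentage of each ordered residue pair."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [100.0 * counts[dp] / total for dp in DIPEPTIDES]


def hybrid_features(sequence):
    """Concatenate both compositions into one 420-value vector (illustrative)."""
    return amino_acid_composition(sequence) + dipeptide_composition(sequence)
```

Vectors of this kind could then be passed to any support vector machine implementation; the kernel, parameters and physicochemical features used in the study are not specified in this record and are not reproduced here.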
 
Publisher RSC
 
Date 2016-07-19
 
Type Article
PeerReviewed
 
Format application/pdf
 
Identifier http://crdd.osdd.net/open/1918/1/c6mb00241b.pdf
Thakur, Anamika and Rajput, Akanksha and Kumar, Manoj (2016) MSLVP: prediction of multiple subcellular localization of viral proteins using a support vector machine. Molecular BioSystems, 12 (8). pp. 2572-2586. ISSN 1742-2051
 
Relation http://pubs.rsc.org/en/Content/ArticleLanding/2016/MB/C6MB00241B#!divAbstract
http://crdd.osdd.net/open/1918/