KRISHI
ICAR RESEARCH DATA REPOSITORY FOR KNOWLEDGE MANAGEMENT
(An Institutional Publication and Data Inventory Repository)
"Not Available": Please do not remove the default option "Not Available" for the fields where metadata information is not available
"1001-01-01": Date not available or not applicable for filling metadata infromation
"1001-01-01": Date not available or not applicable for filling metadata infromation
Please use this identifier to cite or link to this item:
http://krishi.icar.gov.in/jspui/handle/123456789/76967
Title: | A comparative analysis of amino acid encoding schemes for the prediction of flexible length linear B-cell epitopes |
Other Titles: | Not Available |
Authors: | Tanmaya Kumar Sahu; Prabina Kumar Meher; Nalini Kanta Choudhury; Atmakuri Ramakrishna Rao |
ICAR Data Use Licence: | http://krishi.icar.gov.in/PDF/ICAR_Data_Use_Licence.pdf |
Author's Affiliated institute: | ICAR::Indian Agricultural Statistics Research Institute |
Published/ Complete Date: | 2022-08-23 |
Project Code: | Not Available |
Keywords: | epitope prediction; machine learning; peptide encoding; random forest; vaccine designing; linear B-cell epitopes |
Publisher: | Briefings in Bioinformatics |
Citation: | Not Available |
Series/Report no.: | Not Available |
Abstract/Description: | Linear B-cell epitopes have a prominent role in the development of peptide-based vaccines and disease diagnosis. High variability in the length of these epitopes is a major reason for low accuracy in their prediction. Most of the B-cell epitope prediction methods considered fixed length of epitope sequences and achieved good accuracy. Though a number of tools are available for the prediction of flexible length linear B-cell epitopes with reasonable accuracy, further improvement in the prediction performance is still expected. Thus, here we made an attempt to analyze the performance of machine learning approaches (MLA) with 18 different amino acid encoding schemes in the prediction of flexible length linear B-cell epitopes. We considered B-cell epitope sequences of variable lengths (11–56 amino acids) from well-established public resources. The performances of machine learning algorithms with the encoded epitope sequence datasets were evaluated. Besides, the feasible combinations of encoding schemes were also explored and analyzed. The results revealed that amino-acid composition (AC) and distribution component of composition–transition–distribution encoding schemes are suitable for heterogeneous epitope data, whereas amino-acid-anchoring-pair-composition (APC), dipeptide-composition and amino-acids-pair-propensity-scale (APP) are more appropriate for homogeneous data. Further, two combinations of peptide encoding schemes, i.e. APC + AC and APC + APP with random forest classifier were identified to have improved performance over the state-of-the-art tools for flexible length linear B-cell epitope prediction. The study also revealed better performance of random forest over other considered MLAs in the prediction of flexible length linear B-cell epitopes. |
Description: | Not Available |
ISSN: | Not Available |
Type(s) of content: | Research Paper |
Sponsors: | Not Available |
Language: | English |
Name of Journal: | Briefings in Bioinformatics |
Impact Factor: | 13.99 |
Volume No.: | 23(5) |
Page Number: | bbac356 |
Name of the Division/Regional Station: | Not Available |
Source, DOI or any other URL: | https://doi.org/10.1093/bib/bbac356 |
URI: | http://krishi.icar.gov.in/jspui/handle/123456789/76967 |
Appears in Collections: | AEdu-IASRI-Publication |
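Illustrative note on the encodings named in the abstract: amino-acid composition and dipeptide composition both map a peptide of any length to a fixed-length numeric vector, which is what makes them usable for flexible length epitopes. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the 20-letter residue alphabet ordering, and the choice to skip pairs containing non-standard residues are all assumptions made here for the example.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(peptide):
    """Return a 20-element list: the fraction of each standard residue."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    return [peptide.count(aa) / len(peptide) for aa in AMINO_ACIDS]


def dipeptide_composition(peptide):
    """Return a 400-element list: the fraction of each adjacent residue pair."""
    peptide = peptide.upper()
    n_pairs = len(peptide) - 1
    if n_pairs < 1:
        raise ValueError("peptide must have at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = peptide[i:i + 2]
        # Assumption: pairs with non-standard residues (e.g. X) are ignored.
        if pair in counts:
            counts[pair] += 1
    return [counts[dp] / n_pairs for dp in DIPEPTIDES]


# Hypothetical peptides of different lengths yield vectors of equal length.
short_vec = amino_acid_composition("AAGK")
long_vec = amino_acid_composition("AAGKLLVTPEQW")
```

Vectors produced this way could then be passed as feature rows to a classifier such as a random forest; the feature extraction and model settings actually used in the study are described in the paper at the DOI above.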
Files in This Item:
There are no files associated with this item.
Items in KRISHI are protected by copyright, with all rights reserved, unless otherwise indicated.