Record Details

Fast prediction of protein domain boundaries using conserved local patterns

DSpace at IIT Bombay


Field	Value

Title	Fast prediction of protein domain boundaries using conserved local patterns

Creator	JOSHI, RR; SAMANT, VV

Subject	assignment; propainor; protein structures; protein domain boundary points; nonparametric statistics; dgs; psipred

Description	We identified conserved motifs and secondary structural patterns in the vicinity of interior domain boundary points (dbps) using a data-driven approach, with no a priori constraint on the type or number of such features and no requirement of sequence homology. We used these motifs and patterns to rerank the solutions obtained by the well-known domain guess by size (DGS) algorithm. Overall, we predict five solutions. Relative to the top five solutions of DGS, the top five predictions of our method, domain boundary prediction using conserved patterns (DPCP), raise the average accuracy from 71.74 to 82.88 % for two-continuous-domain proteins, and from 21.38 to 80.56 % for two-discontinuous-domain proteins. Considering only the top solution, accuracy rises from 0 to 72.74 % for two-continuous-domain proteins with chain lengths up to 300 residues, and from 0 to 62.85 % for those with up to 400 residues. For discontinuous domains, the top_min solutions of DPCP (the minimum number of solutions required to predict all dbps of a protein) improve the average accuracy of DGS prediction from 12.5 to 76.3 % for proteins with chain lengths up to 300 residues, and from 13.33 to 70.84 % for proteins with up to 400 residues. In our validation experiments, DPCP also outperformed domain identification from secondary structure element alignment (DomSSEA), the best method reported so far for efficient prediction of domain boundaries using predicted secondary structure. The average accuracies of the topmost DomSSEA solution are 61 and 52 % for continuous-domain proteins with up to 300 and 400 residues, respectively; the corresponding accuracies for the discontinuous case are 28 and 21 %.

Publisher	SPRINGER

Date	2011-08-29T12:54:32Z 2011-12-26T12:58:36Z 2011-12-27T05:48:49Z 2006

Type	Article

Identifier	JOURNAL OF MOLECULAR MODELING, 12(6), 943-952; 1610-2940; http://dx.doi.org/10.1007/s00894-006-0116-0; http://dspace.library.iitb.ac.in/xmlui/handle/10054/12089; http://hdl.handle.net/10054/12089

Language	en

ICAR Research Data Repository for Knowledge Management
