SMBI & RSG-Morocco organize a conference
on Tuesday 23rd September 2008 at the ENSAT, Tangier
from 10h to 12h

An integrated approach to structural and functional motif characterisation in proteins

The conference will be presented by Dr. Manuel Corpas
from the Sanger Institute, Cambridge, UK

 

 

Abstract: Despite decades of work, understanding protein folding remains a major research challenge. The main fruits of this massive research effort have been the development of: (i) methods for predicting the likely structures that protein sequences will adopt, or for simulating the folding process itself; and (ii) databases of structural information (e.g., containing 3D coordinates, fold classifications, structure summary data, and so on). As part of the ongoing endeavour to understand the principles of protein folding, we have performed an in-depth analysis of a diverse dataset of folding features, based on a small subset of the PDB (Berman et al., 2000).
The motivation for combining data from many approaches is to offer insights into the role of particular types of residues and fragments in protein folding, and hence to improve our understanding of the factors that are critical to the folding process in general.
Results: From an initial analysis of the data, we found that certain results were strongly correlated: e.g., residue accessibility values (denoting the degree of internal constraint on flexibility), Fold-X scores (denoting the stabilising contributions to the fold), PoPMuSiC values (denoting destabilising contributions), and lattice simulations (denoting the number of close neighbours or interaction partners within the fold). We used these values to synthesise a 'folding score' reflecting the contribution of folding factors to sequence conservation. The folding score provides a method for systematically annotating conserved regions based on their folding factors, discriminating between regions conserved for structural reasons and regions conserved for functional reasons. The folding score was validated by assessing its ability to significantly discriminate residues known to be conserved for structural reasons (e.g., topohydrophobic residues) from poorly conserved ones. As a side effect, we found that regions with poor folding scores and high conservation tend to correspond to functional sites. Moreover, improved homology detection for distant sequences can be achieved when this methodology is applied to family sequence classification.
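
The idea of combining several per-residue measures into one score, then flagging conserved residues with a poor score, can be illustrated with a minimal sketch. This is not the speaker's method: the abstract does not say how the folding score is computed, so the equal-weight average of z-scores, the function names, the toy values, and both cutoffs below are assumptions made purely for illustration. Each feature is assumed to be oriented so that a higher value means a larger contribution to folding.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise a list of numbers to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information about any residue.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def folding_scores(features):
    """Hypothetical folding score: mean of the z-scored features per residue.

    `features` maps a feature name to a list of per-residue values, all
    lists having the same length and the same orientation (higher = more
    important for folding). Equal weighting is an assumption.
    """
    columns = [zscores(vals) for vals in features.values()]
    return [mean(per_residue) for per_residue in zip(*columns)]


def candidate_functional_sites(scores, conservation,
                               score_cutoff=0.0, conservation_cutoff=0.8):
    """Indices of residues that are highly conserved but score poorly.

    Both cutoffs are illustrative assumptions, not published thresholds.
    """
    return [
        i for i, (s, c) in enumerate(zip(scores, conservation))
        if s < score_cutoff and c >= conservation_cutoff
    ]


# Toy data for four residues (invented values, not real measurements).
features = {
    "burial": [0.9, 0.8, 0.2, 0.1],
    "stability": [2.0, 1.5, 0.3, 0.2],
    "contacts": [10, 9, 3, 2],
}
conservation = [0.9, 0.3, 0.95, 0.2]

scores = folding_scores(features)
print(candidate_functional_sites(scores, conservation))  # prints [2]
```

In this toy example residue 0 is conserved and has a high score, which is the pattern the abstract associates with structural conservation, while residue 2 is conserved despite a poor score and is therefore the only candidate functional site.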