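Abstract (translated from the Chinese): |
Given the importance of protein folding rate prediction for studying protein function, many researchers have begun to investigate the factors that influence protein folding rates, and a variety of prediction parameters and methods have been proposed. Drawing on the different characteristic parameters of protein-coding sequences, and on the different effects that secondary structures and folding classes have on the folding rate, we select new eigenvalues (feature values) of the protein-coding sequence, namely the LZ complexity, the isoelectric point, and related features of the protein sequence. We then combine these eigenvalues with the properties of the 20 amino acids, αc, Cα, K0, Pβ, Ra, ΔASA, PI, ΔGhD, Nm, LZ, Mu, and El, to build a multiple linear regression model. With this model, the correlation coefficients between ln(kf) and the predicted values for 13 all-α proteins, 18 all-β proteins, 13 mixed-class proteins, and 39 unclassified proteins reach 0.89, 0.93, 0.98, and 0.86, respectively. Jack-knife validation shows that, within the different structural classes, the mixed eigenvalues correlate well with the corresponding folding rates. The results indicate that eigenvalues such as the LZ complexity and isoelectric point of the protein sequence may influence the folding rate and structure of a protein during folding. |
Key words: protein sequence; eigenvalue; folding rate; correlation coefficient |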
DOI:10.3969/j.issn.1672-5565.2014.03.12 |
Classification code: Q523 |
Funding: |
|
The impact on the folding rate by mixed eigenvalues of protein sequences |
LI Xinying, BAI Fenglan
|
(School of Science, Dalian Jiaotong University, Dalian 116028, China)
|
Abstract: |
Given the importance of protein folding rate prediction for the analysis of protein function, many researchers have begun to study the factors that influence protein folding rates, and several prediction parameters and methods have been proposed. Based on the different eigenvalues (feature values) of protein-coding sequences and on the different effects of secondary structures and folding classes on the folding rate, we selected new eigenvalues of the protein-coding sequence, namely its LZ complexity and isoelectric point. We then integrated these eigenvalues with the properties of the 20 amino acids, αc, Cα, K0, Pβ, Ra, ΔASA, PI, ΔGhD, Nm, LZ, Mu, and El, and established a multiple linear regression model. Using this model, the correlation coefficients between ln(kf) and the predicted values for 13 all-α proteins, 18 all-β proteins, 13 mixed-class proteins, and 39 unclassified proteins were 0.89, 0.93, 0.98, and 0.86, respectively. Jack-knife validation of the model showed that, when proteins are grouped by structural class, the mixed eigenvalues correlate well with the protein folding rates. The results indicate that eigenvalues such as the LZ complexity and isoelectric point of the protein-coding sequence may affect both the folding rate and the structure of a protein. |
Key words: protein sequence; eigenvalue; folding rate; correlation coefficient |
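
The workflow described in the abstract (compute sequence features such as the LZ complexity, fit a multiple linear regression to ln(kf), and check it with Jack-knife validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that "LZ complexity" means the Lempel-Ziv (1976) phrase count, which the abstract does not specify. All function names are hypothetical, and the numbers and the sequence in the demo are synthetic placeholders, not data from the paper.

```python
import math


def lz_complexity(seq):
    """Lempel-Ziv (1976) phrase count (assumed definition of LZ complexity)."""
    i, count, n = 0, 0, len(seq)
    while i < n:
        k = 1
        # Extend the phrase while it can be copied from the preceding text
        # (the copy source may overlap the phrase itself, as in LZ76).
        while i + k <= n and seq[i:i + k] in seq[:i + k - 1]:
            k += 1
        count += 1
        i += k
    return count


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def jackknife_predictions(X, y):
    """Leave each sample out in turn, refit, and predict the held-out sample."""
    preds = []
    for i in range(len(y)):
        beta = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(beta, X[i]))
    return preds


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


if __name__ == "__main__":
    # Synthetic placeholders (NOT from the paper): two made-up features per
    # "protein" and a made-up response standing in for ln(kf).
    X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0], [6.0, 4.0]]
    y = [1.2, 3.9, 2.1, 6.3, 2.8, 9.1]
    preds = jackknife_predictions(X, y)
    print("LZ complexity of a made-up sequence:", lz_complexity("MKVLAAGIVGLLLAQ"))
    print("Jack-knife correlation:", pearson(y, preds))
```

In a real analysis each row of `X` would hold the mixed eigenvalues of one protein (for example its LZ complexity, isoelectric point, and averaged amino acid properties), and a separate model would be fitted for each structural class. The normal-equation solver keeps the sketch dependency-free; a numerical library would be the more robust choice for many correlated features.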