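DOI: 10.12113/201907006
CLC number: Q518.1
Document code: A
Funding: Supported by the National Natural Science Foundation of China (No. 61375013) and the Natural Science Foundation of Shandong Province (No. ZR2013FM020).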
Protein Secondary Structure Prediction Server PSRSM
HAN Xinyi, LIU Yihui
(School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250300, China)
Abstract:
Protein secondary structure prediction is an important part of protein structure research. As many new prediction methods are proposed, new protein secondary structure prediction servers continue to emerge. This paper selects seven commonly used protein secondary structure prediction servers (PSRSM, SPOT-1D, MUFOLD, Spider3, RaptorX, Psipred, and Jpred4), describes how to use them, and evaluates their prediction performance. The test set consists of 180 proteins randomly selected from those released by the PDB between August and November 2018. Seven evaluation measures are used: Q3, Sov, boundary recognition rate, internal recognition rate, coil (C) recognition rate, strand (E) recognition rate, and helix (H) recognition rate. On the 180 test proteins, the Q3 results of the servers listed above were 89.96%, 88.18%, 86.74%, 85.77%, 83.61%, 79.72%, and 78.29%, respectively, which shows that PSRSM gave the best predictions. When the 180 test proteins were grouped by homology thresholds of 30%, 40%, and 70%, the Q3 results of PSRSM were 89.49%, 90.53%, and 89.87%, respectively, in each case higher than those of the other servers. The experimental results suggest that protein secondary structure prediction could be studied further by combining multiple deep learning methods and by training models on large-scale data.
Key words: Protein; Protein secondary structure prediction; PSRSM; Prediction method evaluation
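
For readers unfamiliar with the measures named in the abstract, the following is a minimal sketch of how Q3 and the per-state (H, E, C) recognition rates are commonly computed from a true and a predicted three-state string. The function names `q3` and `per_state_rate` and the example sequences are hypothetical and used only for illustration; the abstract does not state whether the reported Q3 values were pooled over all residues or averaged per protein, so this sketch simply scores a single pair of strings. Sov and the boundary and internal recognition rates are not covered here because the abstract does not define them.

```python
from collections import Counter

# Three-state alphabet used by Q3: helix (H), strand (E), coil (C).
STATES = ("H", "E", "C")


def q3(true_ss: str, pred_ss: str) -> float:
    """Percentage of residues whose predicted state matches the true state."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("true and predicted strings must have equal length")
    if not true_ss:
        raise ValueError("sequences must not be empty")
    correct = sum(t == p for t, p in zip(true_ss, pred_ss))
    return 100.0 * correct / len(true_ss)


def per_state_rate(true_ss: str, pred_ss: str) -> dict:
    """Per-state recognition rate: correctly predicted residues of a state
    divided by the number of residues truly in that state (in percent).
    States absent from the true string are omitted from the result."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("true and predicted strings must have equal length")
    totals = Counter(true_ss)
    hits = Counter(t for t, p in zip(true_ss, pred_ss) if t == p)
    return {s: 100.0 * hits[s] / totals[s] for s in STATES if totals[s]}


if __name__ == "__main__":
    # Hypothetical 10-residue example, not taken from the paper's test set.
    true_ss = "CHHHHCEEEC"
    pred_ss = "CHHHCCEEEE"
    print(f"Q3 = {q3(true_ss, pred_ss):.2f}%")
    for state, rate in per_state_rate(true_ss, pred_ss).items():
        print(f"{state} recognition rate = {rate:.2f}%")
```

In this sketch, a server that predicts 8 of 10 residues correctly receives a Q3 of 80%, while the per-state rates show which of the three classes accounts for the errors.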