Cite this article: | JIN Dong, ZHANG Meng, JIA Cangzhi. Prediction of bacterial transcriptional terminators by using convolutional neural network[J]. Chinese Journal of Bioinformatics, 2022, 20(3): 182-188. |
|
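Abstract: |
In genetics, a terminator is a DNA domain located downstream of the poly(A) site, within a few hundred bases in length, that contains multiple palindromic sequences and whose main function is to terminate transcription. Prokaryotic genomes contain two classes of transcriptional terminators: Rho-dependent and Rho-independent. In this study, a new prediction model (TermCNN) is proposed to identify bacterial transcriptional terminators rapidly and accurately. The model takes a representative subset of 6-mer features (2,537 features) and the electron-ion interaction pseudopotential (EIIP) as input vectors and uses a convolutional neural network (CNN) to build the predictor. Five-fold cross-validation and independent tests show that the model outperforms the latest prediction model, iTerm-PseKNC. Notably, the model has a clear advantage in cross-species tests. It predicts the transcriptional terminators of Escherichia coli (E. coli) and Bacillus subtilis (B. subtilis) with high accuracy. |
Keywords: Transcriptional terminator; Deep learning; Feature selection; Convolutional neural network |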
DOI:10.12113/202104017 |
CLC number: Q939.1 |
Document code: A |
Funding: National Natural Science Foundation of China (No. 62071079). |
|
Prediction of bacterial transcriptional terminators by using convolutional neural network |
JIN Dong, ZHANG Meng, JIA Cangzhi
|
(School of Science, Dalian Maritime University, Dalian 116026, Liaoning, China)
|
Abstract: |
In genetics, a transcriptional terminator is a DNA domain located downstream of the poly(A) site, within a few hundred bases in length, that contains multiple palindromic sequences and terminates transcription. Two classes of transcriptional terminators, Rho-dependent and Rho-independent, have been found in prokaryotic genomes. In this study, a novel model (TermCNN) is proposed for identifying bacterial transcriptional terminators rapidly and accurately. The model combines a representative 6-mer subset (2,537 features) with the electron-ion interaction pseudopotentials (EIIP) of nucleotides as input, and a convolutional neural network (CNN) is used to train and optimize the model. Five-fold cross-validation and independent tests showed that the model outperformed the latest prediction model, iTerm-PseKNC. Notably, the model showed a clear advantage in cross-species tests. In summary, the proposed model can predict transcriptional terminators of Escherichia coli (E. coli) and Bacillus subtilis (B. subtilis) with high accuracy. |
Key words: Transcriptional terminators; Deep learning; Feature selection; Convolutional neural network |
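
To make the two input encodings named in the abstract more concrete, the following is a minimal Python sketch of 6-mer frequency features and per-nucleotide EIIP encoding. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `selected` index list (standing in for the 2,537-feature subset, which the abstract does not enumerate), the normalization of the counts, and the simple concatenation of the two encodings are all hypothetical. The EIIP values are the commonly cited ones for DNA nucleotides.

```python
from itertools import product

# Commonly cited EIIP values for the four DNA nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def kmer_frequencies(seq, k=6):
    """Return normalized counts of all 4**k k-mers, in lexicographic order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing non-ACGT symbols
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


def eiip_encode(seq):
    """Map each nucleotide to its EIIP value (0.0 for unknown symbols)."""
    return [EIIP.get(base, 0.0) for base in seq.upper()]


def build_input(seq, selected=None, k=6):
    """Concatenate (optionally selected) k-mer frequencies with the EIIP profile.

    `selected` is a hypothetical list of feature indices; how the paper
    chooses and combines its features is not specified in the abstract.
    """
    freqs = kmer_frequencies(seq, k)
    if selected is not None:
        freqs = [freqs[i] for i in selected]
    return freqs + eiip_encode(seq)
```

With `k=6` the full feature space has 4^6 = 4,096 entries, which is why a feature-selection step that keeps a representative subset is plausible before the vector is passed to a CNN.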