Cite this article: LI Yuan, FENG Lei, WU Linghui, SHU Qinglong. Automatic or batch acquisition of taxonomic information in NCBI based on Perl scripts[J]. Chinese Journal of Bioinformatics, 2018, 16(3): 170-177.

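Abstract:
Determining which species a scientific name, a taxonomy ID, or an arbitrary nucleic acid or protein sequence belongs to, together with its detailed classification, is one of the most basic and important steps in bioinformatics analysis. However, both the analysis and the retrieval of results are currently done by hand, which is time-consuming, laborious, and error-prone. This study addresses how to obtain taxonomic information from the NCBI website automatically or in batches. By analyzing the NCBI online BLAST results and the characteristics of the web page source code, we wrote automated Perl scripts that retrieve, in batches, the taxonomic information for query or alignment results. The Perl scripts written in this study solve the problem of obtaining taxonomic information automatically or in batches after sequences are aligned online at NCBI. They are applicable to scientific names, taxonomy IDs, and arbitrary nucleic acid or protein sequences of bacteria, fungi, animals, plants, and other organisms, and may serve as a reference for other researchers analyzing biological data.
Keywords: Perl scripts; gene sequence; taxonomic information; NCBI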
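DOI: 10.12113/j.issn.1672-5565.201802002
CLC number: Q343.1
Document code: A
Funding: National Natural Science Foundation of China (Nos. 31560038 and 81473455); Natural Science Foundation of Jiangxi Province (No. 20171BAB205087); Jiangxi University of Traditional Chinese Medicine 2017 University-Level Undergraduate Innovation and Entrepreneurship Training Program.
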
Automatic or batch acquisition of taxonomic information in NCBI based on Perl scripts
LI Yuan, FENG Lei, WU Linghui, SHU Qinglong

(School of Life Sciences, Jiangxi University of Traditional Chinese Medicine, Nanchang 330004, China)

Abstract:
Determining a species and its taxonomic information from a scientific name, a taxonomy ID, or any DNA or protein sequence is one of the most basic and important tasks in bioinformatics analysis. Unfortunately, this process and the retrieval of its results are currently manual, time-consuming, and error-prone. The purpose of this study is to solve the problem of automatic or batch acquisition of taxonomic information from the NCBI website. By analyzing NCBI online BLAST results and the features of the web page source code, we wrote Perl scripts that obtain the taxonomic information of query or alignment results automatically or in batches. The scripts solve the problem of automatically obtaining taxonomic information after NCBI online alignment. They are suitable for scientific names, taxonomy IDs, and arbitrary nucleic acid or protein sequences from bacteria, fungi, animals, plants, and other organisms. In addition, they can provide a reference for the analysis of biological data.
Key words: Perl scripts; gene sequence; taxonomic information; NCBI
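
The batch lookup described in the abstract can be illustrated with a minimal sketch for the taxonomy-ID case. This is not the authors' code: it is written in Python rather than Perl, and it assumes the NCBI E-utilities `efetch` endpoint with `db=taxonomy` instead of parsing BLAST result pages as the paper does. The function names `build_efetch_url`, `parse_taxa_xml`, and `fetch_taxa` are hypothetical and introduced only for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# NCBI E-utilities efetch endpoint (assumption: the paper's scripts parse
# BLAST result pages instead; this sketch swaps in E-utilities).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(tax_ids):
    """Build one request URL covering a whole batch of taxonomy IDs."""
    params = {
        "db": "taxonomy",
        "id": ",".join(str(t) for t in tax_ids),
        "retmode": "xml",
    }
    # Keep commas unescaped so the ID list stays readable in the URL.
    return EFETCH_URL + "?" + urllib.parse.urlencode(params, safe=",")


def parse_taxa_xml(xml_text):
    """Extract ID, name, rank, and ranked lineage from a TaxaSet document."""
    root = ET.fromstring(xml_text)
    records = []
    # Only direct children of the root are query results; the Taxon
    # elements nested under LineageEx are ancestors of each result.
    for taxon in root.findall("Taxon"):
        lineage = [
            (node.findtext("Rank"), node.findtext("ScientificName"))
            for node in taxon.findall("LineageEx/Taxon")
        ]
        records.append({
            "taxid": taxon.findtext("TaxId"),
            "name": taxon.findtext("ScientificName"),
            "rank": taxon.findtext("Rank"),
            "lineage": lineage,
        })
    return records


def fetch_taxa(tax_ids):
    """Download and parse the records for a batch of IDs (needs network)."""
    with urllib.request.urlopen(build_efetch_url(tax_ids), timeout=30) as resp:
        return parse_taxa_xml(resp.read())
```

Calling `fetch_taxa([9606, 562])` would send a single request for both IDs. Keeping the parser separate from the download lets it be checked offline against a saved response.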