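Abstract:
This paper presents application examples of the sequence analysis programs in the European Molecular Biology Open Software Suite (EMBOSS). Section 1 gives a brief overview of the EMBOSS package and its basic usage. Section 2 covers commonly used sequence processing programs for format conversion, sequence extraction, sequence transformation, and sequence display. Section 3 covers sequence alignment programs, including pairwise alignment, multiple sequence alignment, and dot-plot programs. Section 4 covers commonly used nucleotide sequence analysis programs, which can be used for nucleotide composition statistics, open reading frame analysis, CpG island identification, codon usage statistics, and repeat sequence detection. Section 5 covers commonly used protein sequence analysis programs, including amino acid composition statistics, identification of sequence feature sites, and secondary structure analysis. Drawing on teaching examples, the paper selects some commonly used programs, shows how to run them, and briefly explains the biological significance of the results. It closes with a discussion of points that need attention when running the programs, and a table listing the names and purposes of some commonly used programs for easy reference.
Keywords: EMBOSS software suite; pairwise sequence alignment; multiple sequence alignment; dot-plot; nucleotide sequence analysis; protein sequence analysis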
DOI: 10.12113/202008002
CLC number: Q349+.53
Document code: A
Funding:
Application examples of EMBOSS sequence analysis program
LUO Jingchu
(College of Life Sciences and Center for Bioinformatics, Peking University, Beijing 100871, China)
Abstract:
This paper introduces the European Molecular Biology Open Software Suite (EMBOSS) through practical application examples. The first section gives a brief overview of EMBOSS and describes the general usage of its programs. The second section introduces tools for sequence format conversion, sequence retrieval, manipulation, and display. The third section presents sequence alignment programs, including pairwise and multiple sequence alignment as well as dot-plot programs. The fourth section reviews commonly used nucleotide sequence analysis programs, which can be used for composition statistics, open reading frame analysis, CpG island prediction, codon usage statistics, and repeat sequence identification. The last section summarizes protein sequence analysis programs for amino acid composition calculation, sequence motif discovery, and secondary structure analysis. Application examples for some commonly used programs are drawn from teaching experience, with the specific steps for running each program and the biological significance of the results explained. Finally, points that need attention when running the programs are discussed, and a summary table lists the names and uses of some commonly used programs.
Key words: EMBOSS; Pairwise sequence alignment; Multiple sequence alignment; Dot-plot; Nucleotide sequence analysis; Protein sequence analysis