

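Supervising authority: Ministry of Industry and Information Technology; Sponsor: Harbin Institute of Technology; Editor-in-Chief: REN Nanqi; ISSN 1672-5565; CN 23-1513/Q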

LUO Jingchu.A brief introduction to UniProt[J].Chinese Journal of Bioinformatics,2019,17(3):131-144.
A brief introduction to UniProt
LUO Jingchu
(College of Life Sciences, Peking University, Beijing 100871, China)
The Universal Protein Resource (UniProt, https://www.uniprot.org/) is a well-known protein database consisting of three components: the UniProt knowledgebase (UniProtKB), the UniProt unique protein identifier archive (UniParc), and the UniProt reference sequence clusters (UniRef). UniProtKB is the core of the database and provides comprehensive annotations in addition to protein sequence data. UniProtKB/Swiss-Prot, the manually reviewed and annotated subset of UniProtKB, has more than 500 thousand entries, while UniProtKB/TrEMBL contains more than 140 million unreviewed sequences that are translated from coding sequences in the EMBL nucleotide database and annotated computationally according to defined rules. UniParc avoids redundancy by merging identical sequences stored in UniProtKB and other available protein sequence databases into a single record, and it assigns each record a permanent, unique identifier. UniRef clusters the UniProtKB sequences and selected UniParc sequences into three sets, UniRef100, UniRef90, and UniRef50, according to their sequence identity. The UniProt website offers an easy-to-use, efficient interface for advanced search, together with various help documents. Release statistics are published online with each database update, every four weeks, and list useful information such as the numbers of newly added and updated entries, the sequence types and their taxonomic sources, general annotations, sequence features, and database cross-references. Since it was established at the beginning of this century, UniProt has served the life science community as the most comprehensive, well-annotated, non-redundant, and freely accessible resource of protein sequence and function.
Key words:  Database  Protein sequence  Protein function  Database annotation  Database cross-reference  Database query
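
The abstract describes UniParc as merging identical sequences from several source databases into a single record that carries a permanent, unique identifier. A minimal Python sketch of that idea follows. It is an illustration only and not UniProt's actual implementation: the `ArchiveRecord` class, the `merge_identical` function, the `REC000001`-style identifiers, and the example database names and accessions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveRecord:
    """One non-redundant record: a sequence plus every source entry that shares it."""
    record_id: str   # hypothetical identifier format, not a real UniParc ID
    sequence: str
    sources: set = field(default_factory=set)


def merge_identical(entries):
    """Merge entries with identical sequences into single records.

    `entries` is an iterable of (source_db, accession, sequence) tuples.
    Sequences are compared after stripping whitespace and upper-casing,
    so only exact matches (100% identity over the full length) are merged.
    """
    by_sequence = {}
    for source_db, accession, sequence in entries:
        key = sequence.strip().upper()
        record = by_sequence.get(key)
        if record is None:
            # First time this sequence is seen: assign the next identifier.
            record = ArchiveRecord(f"REC{len(by_sequence) + 1:06d}", key)
            by_sequence[key] = record
        record.sources.add((source_db, accession))
    return list(by_sequence.values())


# Made-up entries: the first two share one sequence, the third is distinct.
example_entries = [
    ("db_one", "ACC_A", "MKTAYIAKQR"),
    ("db_two", "ACC_B", "mktayiakqr"),
    ("db_one", "ACC_C", "MSLLTEVETP"),
]

records = merge_identical(example_entries)
for record in records:
    print(record.record_id, record.sequence, sorted(record.sources))
```

Keying the dictionary on the normalized sequence keeps the merge linear in the number of entries. Clustering at lower identity thresholds, as in UniRef90 and UniRef50, requires sequence comparison and is outside the scope of this sketch.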

