PANTHER
In bioinformatics, the PANTHER classification system is a large curated biological database of gene/protein families and their functionally related subfamilies that can be used to classify and identify the function of gene products. PANTHER is part of the Gene Ontology Reference Genome Project designed to classify proteins and their genes for high-throughput analysis.
The project consists of both manual curation and bioinformatics algorithms. Proteins are classified according to family, molecular function, biological process and pathway. It is one of the databases feeding into the European Bioinformatics Institute's InterPro database.
Application of PANTHER
The most important application of PANTHER is to accurately infer the function of uncharacterized genes from any organism based on their evolutionary relationships to genes with known functions. By combining gene function, ontology, pathways and statistical analysis tools, PANTHER enables biologists to analyze large-scale, genome-wide data obtained from current technologies such as sequencing, proteomics and gene expression experiments.
In short, using the data and tools in PANTHER, users can:
- Obtain information about a particular gene of interest.
- Discover protein families and subfamilies, pathways, biological processes, molecular functions and cellular components.
- Create lists of genes related to a particular protein family/subfamily, molecular function, biological process or pathway.
- Analyze lists of genes, proteins or transcripts.
PANTHER history
- 1998: Project was launched at Molecular Applications Group.
- 1999: Acquired by Celera Genomics.
- 2000: PANTHER 1 released in Celera Discovery Systems.
- 2001: PANTHER 2 released; it was used in the annotation of the first published Celera human genome.
- 2002: PANTHER 3 released. PANTHER annotations were integrated into FlyBase. Moved to ABI.
- 2003: PANTHER 4 released with the public release of the PANTHER Classification System.
- 2005: PANTHER 5 released with PANTHER Pathway and analysis tool. Established collaboration with InterPro.
- 2006: PANTHER 6 released. Moved to SRI.
- 2010: PANTHER 7 released.
- 2011: Moved to USC.
- 2012: PANTHER 8 released.
- 2014: PANTHER 9 released.
- 2015: PANTHER 10 released.
- 2016: PANTHER 11 released.
Phylogenetic tree
- Each node is annotated by gene attributes including “subfamily membership”, “protein class”, “gene function”. These attributes are heritable. Swiss-Prot protein names are usually used to name subfamilies. Since PANTHER is part of the GO reference genome project, the Gene Ontology terms are used for gene function. PANTHER/X ontology terms are used for protein class.
- Each internal node is annotated by evolutionary events such as “speciation”, “gene duplication” and “horizontal gene transfer”.
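Because the gene attributes are heritable, an annotation made on one node can be read as applying to the nodes beneath it. The following is a minimal sketch of that idea, not PANTHER's actual implementation: the `Node` class, the node names, the attribute values, and the rule that a node's own annotation takes precedence over an inherited one are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A tree node; every name and field here is hypothetical."""
    name: str
    event: Optional[str] = None  # e.g. "speciation"; None for leaf nodes
    annotations: Dict[str, str] = field(default_factory=dict)  # made directly on this node
    children: List["Node"] = field(default_factory=list)


def propagate(node: Node,
              inherited: Optional[Dict[str, str]] = None) -> Dict[str, Dict[str, str]]:
    """Return the effective annotations of `node` and all nodes below it.

    Each node inherits its ancestors' annotations; an annotation made
    directly on a node overrides the inherited value (assumed rule).
    """
    effective = {**(inherited or {}), **node.annotations}
    result = {node.name: effective}
    for child in node.children:
        result.update(propagate(child, effective))
    return result


# A tiny made-up tree: root (speciation) -> dup1 (gene duplication) -> two genes.
gene_a = Node("geneA")
gene_b = Node("geneB", annotations={"subfamily": "SF2"})
dup1 = Node("dup1", event="gene duplication",
            annotations={"subfamily": "SF1", "protein class": "kinase"},
            children=[gene_a, gene_b])
root = Node("root", event="speciation",
            annotations={"gene function": "protein kinase activity"},
            children=[dup1])

effective = propagate(root)
print(effective["geneA"])
print(effective["geneB"])
```

In this sketch `geneA` carries no annotation of its own and so takes everything from `dup1` and `root`, while `geneB` keeps its own subfamily and inherits the rest.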
PANTHER library data generation process
The process for data generation is divided into three steps:
- Family Clustering
- Phylogenetic Tree Building
- Annotation of Tree Nodes
Family clustering
Sequence set
PANTHER trees depict gene family evolution across a broad selection of fully sequenced genomes. PANTHER uses one sequence per gene so that the tree represents the events that occurred over the course of evolution, i.e. duplication and speciation. The PANTHER genome set is selected based on the following criteria:
- The set should include the major experimental model organisms; this assists in inferring functional information for organisms that are less studied.
- The set should include a broad taxonomic range of other genomes, preferably fully sequenced and annotated; this assists in relating the experimental model organisms to each other.
Family clusters
- The family must contain at least five members, of which at least one gene must be from a GO reference genome.
- To support phylogenetic inference, the family must have a high-quality sequence alignment.
- Alignment quality is assessed by the length of the aligned region: at least 30 sites must be aligned across 75% or more of the family members.
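The criteria above can be expressed as a simple check. This is only a sketch under stated assumptions: the alignment is taken to be a mapping from gene ID to an equal-length aligned string with `-` for gaps, a "site aligned across 75% of members" is interpreted as a column in which at least 75% of sequences have a residue, and the function names and example genomes are hypothetical rather than part of PANTHER.

```python
from typing import Dict, Set

MIN_MEMBERS = 5             # at least five family members
MIN_ALIGNED_SITES = 30      # at least 30 aligned sites
MIN_MEMBER_FRACTION = 0.75  # each site covered by 75% or more of members


def count_aligned_sites(alignment: Dict[str, str]) -> int:
    """Count columns where at least 75% of the sequences have a residue, not a gap."""
    sequences = list(alignment.values())
    if not sequences:
        return 0
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    needed = MIN_MEMBER_FRACTION * len(sequences)
    return sum(
        1
        for column in zip(*sequences)
        if sum(residue != "-" for residue in column) >= needed
    )


def is_valid_family(alignment: Dict[str, str],
                    genome_of: Dict[str, str],
                    reference_genomes: Set[str]) -> bool:
    """Apply the three family criteria to one candidate family."""
    members = set(alignment)
    if len(members) < MIN_MEMBERS:
        return False
    if not any(genome_of[m] in reference_genomes for m in members):
        return False
    return count_aligned_sites(alignment) >= MIN_ALIGNED_SITES


# Made-up example: five genes, 40 alignment columns, small gaps at the ends.
alignment = {
    "g1": "M" * 40,
    "g2": "M" * 40,
    "g3": "M" * 40,
    "g4": "M" * 35 + "-" * 5,
    "g5": "-" * 5 + "M" * 35,
}
genome_of = {"g1": "human", "g2": "other", "g3": "other", "g4": "other", "g5": "other"}
print(is_valid_family(alignment, genome_of, reference_genomes={"human"}))
```

Keeping the thresholds as named constants makes it easy to see which number corresponds to which criterion.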
Phylogenetic tree building
Annotation of tree nodes
Each node in a PANTHER tree is annotated with heritable attributes. Heritable attributes are of three types: subfamily membership, gene function and protein class membership. These node annotations apply to the primary sequences that were used to construct the tree. They are applied to the primary sequences using a simple evolutionary principle: each node's annotation is propagated to its descendant nodes.
PANTHER components
PANTHER/LIB: The library consists of a collection of books, each of which represents a protein family. For each protein family in the library there is a hidden Markov model, a multiple sequence alignment and a family tree.
PANTHER/X: The index contains an abbreviated ontology that assists in summarizing and navigating molecular functions and biological processes. Although the PANTHER/X ontology has a hierarchical organization, it is a directed acyclic graph, so child categories appear under more than one parent when this is biologically justified. PANTHER/X has been mapped to GO and is arranged differently to facilitate large-scale analysis of proteins.
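To illustrate what it means for an ontology to be a directed acyclic graph rather than a strict tree, the sketch below stores each category with a set of direct parents and walks upward to collect all ancestors. The category names and the `PARENTS` table are invented for this example and are not taken from PANTHER/X.

```python
from typing import Dict, Set

# Hypothetical categories; each child maps to the set of its direct parents.
PARENTS: Dict[str, Set[str]] = {
    "enzyme modulator": set(),
    "signaling molecule": set(),
    "kinase modulator": {"enzyme modulator"},
    # This category sits under two parents, which a strict tree could not express.
    "kinase activator": {"kinase modulator", "signaling molecule"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Collect every category reachable by following parent links upward."""
    found: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parents.get(current, ()))
    return found


print(sorted(ancestors("kinase activator", PARENTS)))
# ['enzyme modulator', 'kinase modulator', 'signaling molecule']
```

A gene assigned to the child category would therefore be counted under both parent branches in a large-scale analysis.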
PANTHER Pathways
PANTHER includes 176 pathways, created using the CellDesigner tool. PANTHER pathways can be downloaded in the following file formats:
- Systems Biology Markup Language
- Systems Biology Graphical Notation
- BioPAX
Recent versions of PANTHER and their statistics and updates
Version 6.0
Version 6 uses UniProt sequences as training sequences. There are 19132 UniProt training sequences directly associated with the pathway components. This version has ~1500 reactions in 130 pathways, and the number of pathways associated with subfamilies was expanded. PANTHER became a member of the InterPro Consortium. The availability of PANTHER data was improved. PANTHER/LIB version 6.1 contains 221609 UniProt sequences from 53 organisms, grouped into 5546 families and 24561 subfamilies.
Version 7.0
In this version the phylogenetic trees represent speciation and gene duplication events, and gene orthologs can be identified. There is more support for alternative database identifiers for genes, proteins and microarray probes. PANTHER version 7 uses the SBGN standard to depict biological pathways. It includes a set of 48 genomes. To define new families, approximately 1000 families from non-animal genomes were added in this version in collaboration with the European Bioinformatics Institute’s InterPro group. The sources of gene sets included model organism databases, Ensembl genome annotation and Entrez Gene. Since this version, a stable identifier has been assigned to each node in the tree. This stable identifier is a nine-digit number with the prefix PTN.
Version 8.0 (2012)
This version uses the reference proteome set maintained by the UniProt resource, so the source of gene sets is UniProt. It includes a set of 82 genomes and 991985 protein-coding genes, of which 642319 genes have been used for family clusters. The PANTHER website was redesigned to facilitate common user workflows.
Version 9.0 (2014)
This version contains 7180 protein families, divided into 52,768 functionally distinct protein subfamilies. Version 9.0 includes the genomes of 85 organisms.
Version 11.1 (2016)
This version contains 78442 subfamilies and 1,064,054 annotated genes.
PANTHER website
The home page of the PANTHER website shows several folder tabs for major workflows, including gene list analysis, browse, sequence search, cSNP scoring and keyword search. The details of each of these workflows are provided below.
Gene list analysis
This tab is selected by default because it is the most frequently used option. You can enter valid IDs in the box or upload a file, then select the list type, choose the organism of interest and select the type of analysis.
A practical example:
Let's try this workflow using a small gene list containing three genes: AKT1, AKT2 and AKT3. We first type these gene names into the box, separated by commas. We select "ID list" as the list type, "Homo sapiens" as the organism and "Functional classification viewed in gene list" as the type of operation, then click submit. The result gives the following information for each of the three genes:
- Gene IDs from Ensembl and protein IDs from UniProt: in this example, you should see "ENSG00000142208" and "P31749" for AKT1.
- Mapped IDs: these are simply the names of the genes that have been mapped to your query.
- Gene names, gene symbols, and the orthologs: the orthologs are clickable and by clicking on them you can see the list of other organisms and their IDs as well as the type of orthologs.
- PANTHER family and subfamily: This gives you the name of the family and subfamily for each of your genes. There are some clickable links, e.g. a link to the family tree. Finally, you will see the genes from different species assigned to that subfamily. In this example you have the PANTHER subfamily "PTHR24352:SF30" for AKT1.
- GO molecular function: This tells you the functions of your query gene; e.g. AKT1 has protein kinase activity and can selectively and non-covalently interact with calcium ions, calmodulin, and phospholipids.
- GO biological process: This column shows the biological processes in which the gene is involved; e.g. AKT1 has a role in gamete generation, apoptosis, cell cycle, etc.
- GO cellular component: This tells you where in the cell your query protein can be found. In our example, the information is not available, but if you try other examples, you will see cellular components such as "nucleus", "cytoplasm", "chromosomes", etc.
- PANTHER protein class: this gives you the names and IDs of PANTHER protein class for each of the genes; e.g. AKT1 is under PANTHER protein class "non-receptor serine/threonine protein kinase" with class ID "PC00167". You can also see its parent and child lineage.
- Pathways: A list of clickable names of the pathways in which your query gene exists will be shown; e.g. AKT1 is involved in several pathways such as "Hypoxia response via HIF", "Apoptosis signaling pathway", "PI3 kinase pathway", etc.
- Species: This is the name of species you have chosen; in this case we chose "Homo sapiens".
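When working with the results of such a query in a script, it can help to prepare the input list and split the family/subfamily identifier into its parts. The sketch below does only that and does not contact the PANTHER website; the identifier pattern is inferred from the single example "PTHR24352:SF30" above and is an assumption rather than an official grammar, and the function names are hypothetical.

```python
import re
from typing import Iterable, NamedTuple, Optional


class PantherId(NamedTuple):
    family: str
    subfamily: Optional[str]  # None when only a family ID is given


# Assumed pattern: "PTHR" plus digits, optionally followed by ":SF" plus digits.
_ID_PATTERN = re.compile(r"^(PTHR\d+)(?::(SF\d+))?$")


def parse_panther_id(text: str) -> PantherId:
    """Split an ID such as 'PTHR24352:SF30' into family and subfamily parts."""
    match = _ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised PANTHER family/subfamily ID: {text!r}")
    return PantherId(family=match.group(1), subfamily=match.group(2))


def format_gene_list(genes: Iterable[str]) -> str:
    """Join gene symbols with commas, as typed into the input box in the example."""
    return ",".join(gene.strip() for gene in genes)


print(format_gene_list(["AKT1", "AKT2", "AKT3"]))
print(parse_panther_id("PTHR24352:SF30"))
```

Returning a named tuple keeps the family and subfamily parts separately addressable, which is convenient when grouping genes by family.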
Browse