Putative gene


A putative gene is a segment of DNA that is believed to be a gene. Putative genes can share sequence similarities with already characterized genes and can thus be inferred to share a similar function, yet their exact function remains unknown. Newly identified sequences are considered putative gene candidates when homologs of those sequences are found to be associated with the phenotype of interest.

Examples

Examples of studies involving putative genes include the discovery of 30 putative receptor genes in the rat vomeronasal organ and the identification of 79 putative TATA boxes found in many plant genomes.

Practical importance

To define and characterize a biosynthetic gene cluster, all the putative genes within that cluster must first be identified and their functions characterized. This can be done through complementation and knockout experiments. As putative genes are characterized, the genome under study becomes increasingly well understood because more interactions can be identified. Identification of putative genes is also necessary to study genomic evolution, as a significant proportion of a genome is made up of larger families of related genes. Genomic evolution occurs by processes such as duplication of individual genes, genome segments, or entire genomes. These processes can result in loss of function, altered function, or gain of function, and can have drastic effects on the phenotype.
DNA mutations outside of a putative gene can act through a position effect, in which they alter the expression of the gene. These alterations leave the transcription unit and promoter of the gene intact, but may involve distal promoters, enhancer or silencer elements, or the local chromatin environment. Such mutations can be linked to diseases or disorders associated with the gene.

Identification

Putative genes can be identified by clustering large groups of sequences by pattern and arranging them by mutual similarity, or can be inferred from the presence of potential TATA boxes.
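The TATA box approach can be illustrated with a minimal Python sketch. It assumes a simplified consensus motif, TATAWAW (where W is A or T), and scans a single strand only; the function name find_tata_boxes and the example sequence are hypothetical, and real promoter prediction relies on more than a motif match.

```python
import re

# Simplified TATA-box consensus TATAWAW (W = A or T).
# This motif is an assumption for illustration, not a validated model.
# The lookahead lets overlapping sites be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(sequence):
    """Return (0-based start, matched motif) for each TATA-like site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TATA_PATTERN.finditer(seq)]


# Made-up sequence used only to show the call.
example = "GGCCTATAAAAGGCGCGTATATATGC"
hits = find_tata_boxes(example)
for start, motif in hits:
    print(start, motif)
```

A hit only marks a candidate site; a region downstream of it would still need supporting evidence before being called a putative gene.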
Putative genes can also be identified by recognizing differences between well-known gene clusters and gene clusters with a unique profile.
Software tools have been developed to identify putative genes automatically. They do so by searching for gene families and testing the validity of uncharacterized genes by comparison with already identified genes.
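The comparison step can be sketched roughly as follows. This is not how any particular tool works: it substitutes a simple k-mer Jaccard similarity for the sequence alignment that real software typically uses, and the function names, the 0.5 threshold, and the toy sequences are all assumptions made for illustration.

```python
def kmers(seq, k=3):
    """Return the set of overlapping k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity between the k-mer sets of two sequences."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka and not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def best_family(candidate, families, k=3, threshold=0.5):
    """Assign a candidate to the family holding its most similar member.

    Returns (family name or None, best score). The threshold is an
    arbitrary illustrative cutoff, not a recommended value.
    """
    best_name, best_score = None, 0.0
    for name, members in families.items():
        for member in members:
            score = jaccard(candidate, member, k)
            if score > best_score:
                best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy "characterized" families; the sequences are made up.
families = {
    "family_A": ["ATGGCCATTGTAATGGGCC"],
    "family_B": ["ATGTTTCCCGGGAAATTT"],
}
print(best_family("ATGGCCATTGTAATGGGCC", families))
```

A candidate that clears the threshold would be annotated as a putative member of that family; one that does not is left unassigned.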
Protein products can be identified and used to characterize the putative genes that code for them.