The trefoil knot fold is a protein fold in which the protein backbone is twisted into a trefoil knot shape. "Shallow" knots, in which the tail of the polypeptide chain passes through a loop by only a few residues, are uncommon, but "deep" knots, in which many residues pass through the loop, are extremely rare. Deep trefoil knots have been found in the SPOUT superfamily, including methyltransferase proteins involved in posttranscriptional RNA modification in all three domains of life, including the bacterium Thermus thermophilus as well as proteins in archaea and in eukaryota. In many cases the trefoil knot is part of the active site or a ligand-binding site and is critical to the activity of the enzyme in which it appears.
Before the discovery of the first knotted protein, it was believed that the process of protein folding could not efficiently produce deep knots in protein backbones. Studies of the folding kinetics of a dimeric protein from Haemophilus influenzae have revealed that the folding of trefoil knot proteins may depend on proline isomerization.
Computational algorithms have been developed to identify knotted protein structures, both to canvass the Protein Data Bank for previously undetected natural knots and to identify knots in protein structure predictions, where a knot is unlikely to reflect the native-state structure given the rarity of knots in known proteins. The pKNOT web server detects knots in proteins and provides information on knotted proteins in the Protein Data Bank.
Knottins are small, diverse and stable proteins with important drug design potential. They can be classified into 30 families, which cover a wide range of sequences, three-dimensional structures and functions. Inter-knottin similarity lies mainly between 20% and 40% sequence identity, with backbone deviations of 1.5 to 4 Å, although all knottins share a tightly knotted disulfide core. This substantial variability is likely to arise from the highly diverse loops that connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.
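The knot-detection algorithms mentioned above are not described in detail here. As a rough illustration only, the sketch below shows one commonly described idea, a chain-reduction scheme (often referred to as KMT reduction): a backbone point is deleted whenever the triangle it forms with its two neighbours is not pierced by any other part of the chain, and a chain that cannot be reduced to its two termini is reported as knotted. This is not claimed to be the method used by pKNOT. The function names (`reduce_chain`, `is_knotted`, `trefoil_chain`, `open_arc`) are hypothetical, the coordinates are synthetic curves rather than Protein Data Bank entries, and joining the two termini with a straight segment is a simplifying assumption.

```python
import math

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def segment_hits_triangle(p, q, a, b, c, eps=1e-12):
    """Moller-Trumbore style test: does segment p-q cross triangle a-b-c?"""
    d, e1, e2 = sub(q, p), sub(b, a), sub(c, a)
    h = cross(d, e2)
    det = dot(e1, h)
    if abs(det) < eps:          # parallel to the plane or degenerate triangle
        return False            # (coplanar overlaps are ignored in this sketch)
    s = sub(p, a)
    u = dot(s, h) / det
    if u < 0.0 or u > 1.0:
        return False
    qv = cross(s, e1)
    v = dot(d, qv) / det
    if v < 0.0 or u + v > 1.0:
        return False
    t = dot(e2, qv) / det       # position of the hit along the segment
    return 0.0 <= t <= 1.0

def removable(pts, i):
    """True if no other segment pierces triangle (i-1, i, i+1)."""
    n = len(pts)
    tri = (i - 1, i, i + 1)
    # Segment j joins pts[j] to pts[(j + 1) % n]; j == n - 1 is the assumed
    # straight closing segment between the two termini.
    for j in range(n):
        k = (j + 1) % n
        if j in tri or k in tri:
            continue            # shares a vertex with the triangle
        if segment_hits_triangle(pts[j], pts[k], pts[i - 1], pts[i], pts[i + 1]):
            return False
    return True

def reduce_chain(points):
    """Repeatedly delete removable interior points; termini stay fixed."""
    pts = list(points)
    changed = True
    while changed and len(pts) > 2:
        changed = False
        for i in range(1, len(pts) - 1):
            if removable(pts, i):
                del pts[i]
                changed = True
                break
    return pts

def is_knotted(points):
    return len(reduce_chain(points)) > 2

def trefoil_chain(n=40, phase=0.1):
    """Synthetic open chain sampled along a standard trefoil curve."""
    ts = [phase + 2 * math.pi * k / n for k in range(n)]
    return [(math.sin(t) + 2 * math.sin(2 * t),
             math.cos(t) - 2 * math.cos(2 * t),
             -math.sin(3 * t)) for t in ts]

def open_arc(n=20):
    """Synthetic unknotted chain: three quarters of one helical turn."""
    angles = [1.5 * math.pi * k / (n - 1) for k in range(n)]
    return [(math.cos(a), math.sin(a), 0.1 * a) for a in angles]
```

Calling `is_knotted(trefoil_chain())` and `is_knotted(open_arc())` contrasts a knotted and an unknotted synthetic chain. Real detectors treat the chain termini more carefully than the straight closure assumed here, which can misjudge shallow knots.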
The trefoil domain is a cysteine-rich domain of approximately forty-five amino-acid residues found in some extracellular eukaryotic proteins. It is known as the 'P', 'trefoil' or 'TFF' domain, and contains six cysteines linked by three disulphide bonds with connectivity 1-5, 2-4, 3-6. Proteins containing the domain include protein pS2, a protein secreted by the stomach mucosa; spasmolytic polypeptide, a protein of about 115 residues that inhibits gastrointestinal motility and gastric acid secretion; intestinal trefoil factor; Xenopus laevis stomach proteins xP1 and xP4; Xenopus integumentary mucins A.1 and C.1, proteins which may be involved in defense against microbial infections by protecting the epithelia from the external environment; Xenopus skin protein xp2; zona pellucida sperm-binding protein B; intestinal sucrase-isomaltase, a vertebrate membrane-bound, multifunctional enzyme complex which hydrolyzes sucrose, maltose and isomaltose; and lysosomal alpha-glucosidase.
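The connectivity notation 1-5, 2-4, 3-6 refers to the cysteines in their order of appearance along the sequence, not to residue numbers. A minimal sketch of how that notation maps onto residue positions is given below. The helper `disulphide_pairs` and the sequence `TOY_SEQUENCE` are hypothetical: the sequence is made up for illustration and is not a real trefoil domain.

```python
# Connectivity stated in the text: 1st cysteine bonds to the 5th,
# 2nd to the 4th, and 3rd to the 6th (1-based cysteine ranks).
TREFOIL_CONNECTIVITY = [(1, 5), (2, 4), (3, 6)]

# Made-up toy sequence with exactly six cysteines (NOT a real protein).
TOY_SEQUENCE = "ACKLMCPQRCSTCCVWEYC"

def disulphide_pairs(sequence, connectivity=TREFOIL_CONNECTIVITY):
    """Map cysteine-rank pairs onto 1-based residue positions."""
    cys_positions = [i + 1 for i, aa in enumerate(sequence) if aa == "C"]
    expected = max(max(pair) for pair in connectivity)
    if len(cys_positions) != expected:
        raise ValueError(
            f"expected {expected} cysteines, found {len(cys_positions)}"
        )
    return [(cys_positions[a - 1], cys_positions[b - 1])
            for a, b in connectivity]

pairs = disulphide_pairs(TOY_SEQUENCE)
```

For the toy sequence the cysteines sit at positions 2, 6, 10, 13, 14 and 19, so `pairs` holds the residue pairs (2, 14), (6, 13) and (10, 19).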
Examples
Human genes encoding proteins containing the trefoil domain include: