List of known Interpro domains in Plasmodium falciparum


G3DSA:1.10.10.10 - Wing_hlx_DNA_bd (Gene3D link)

Interpro entry IPR011991 : (Interpro link)

Interpro description:

Winged helix DNA-binding proteins share a related winged helix-turn-helix DNA-binding motif, where the "wings", or loops, are small beta-sheets. The winged helix motif consists of two wings (W1, W2), three alpha helices (H1, H2, H3) and three beta-sheets (S1, S2, S3) arranged in the order H1-S1-H2-H3-S2-W1-S3-W2. The DNA-recognition helix makes sequence-specific DNA contacts with the major groove of DNA, while the wings make different DNA contacts, often with the minor groove or the backbone of DNA. Several winged-helix proteins display an exposed patch of hydrophobic residues thought to mediate protein-protein interactions.
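
As an illustration only, the element order quoted above can be written down as a small data structure. The sketch below is not part of the InterPro entry; the names TOPOLOGY, KINDS and parse_topology are hypothetical, and the element labels simply follow the wording of the description.

```python
from collections import Counter

# Topology string exactly as quoted in the description above.
TOPOLOGY = "H1-S1-H2-H3-S2-W1-S3-W2"

# One-letter prefixes as used in the description (labels are illustrative).
KINDS = {"H": "alpha helix", "S": "beta-sheet", "W": "wing"}


def parse_topology(topology):
    """Split a topology string into (kind, index) pairs, in chain order."""
    elements = []
    for token in topology.split("-"):
        elements.append((KINDS[token[0]], int(token[1:])))
    return elements


elements = parse_topology(TOPOLOGY)
counts = Counter(kind for kind, _ in elements)
print(counts)
```

Counting the parsed elements gives the three helices, three S elements and two wings that the description lists.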

Many different proteins with diverse biological functions contain a winged helix DNA-binding domain, including transcriptional repressors such as biotin repressor, LexA repressor and the arginine repressor; transcription factors such as the hepatocyte nuclear factor-3 proteins involved in cell differentiation, heat-shock transcription factor, and the general transcription factors TFIIE and TFIIF; helicases such as RuvB that promotes branch migration at the Holliday junction, and CDC6 in the pre-replication complex; endonucleases such as FokI and TnsA; histones; and Mu transposase, where the flexible wing of the enhancer-binding domain is essential for efficient transposition.

Proteins where this domain is known:
PF08_0094    PF10_0174    PF11_0192    PF11_0469    PF14_0025    PFF1015w   


G3DSA:1.10.10.140 - Cyt_c_oxidase_6B (Gene3D link)

Interpro entry IPR003213 : Cytochrome c oxidase, subunit VIb (Interpro link)

Interpro description:

Cytochrome c oxidase is an oligomeric enzymatic complex that is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits is the potentially haem-binding subunit, VIb, which is encoded in the nucleus.

Proteins where this domain is known:
PFI1375w   


G3DSA:1.10.10.250 - Ribosomal_L11 (Gene3D link)

Interpro entry IPR000911 : Ribosomal protein L11 (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

Proteins where this domain is known:
PF11_0113   


G3DSA:1.10.10.440 - FF (Gene3D link)

Interpro entry IPR002713 : (Interpro link)

Interpro description:

The FF domain may be involved in protein-protein interaction. It often occurs as multiple copies and often accompanies WW domains. PRP40 from yeast encodes a novel, essential splicing component that associates with the yeast U1 small nuclear ribonucleoprotein particle.

Proteins where this domain is known:
PF13_0091   


G3DSA:1.10.10.460 - G3DSA:1.10.10.460 (Gene3D link)

Proteins where this domain is known:
PFF1150w   


G3DSA:1.10.10.540 - XPC-bd (Gene3D link)

Interpro entry IPR015360 : XPC-binding domain (Interpro link)

Interpro description:

Members of this entry adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair.

Proteins where this domain is known:
PF10_0114   


G3DSA:1.10.10.60 - Homeodomain-rel (Gene3D link)

Interpro entry IPR012287 : Homeodomain-related (Interpro link)

Interpro description:

Homeodomain proteins are transcription factors that share a related DNA binding homeodomain. The homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two alpha helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. Many proteins contain homeodomains, including Drosophila Engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and HOX proteins.

The homeodomain motif is very similar in sequence and structure to domains in a wide range of DNA-binding proteins, including recombinases, Myb proteins, GARP response regulators, human telomeric proteins (hTRF1), paired domain proteins (PAX), yeast RAP1, centromere-binding proteins CENP-B and ABP-1, transcriptional regulators (TyrR), AraC-type transcriptional activators, and tetracycline repressor-like proteins (TetR, QacR, YcdC).

Proteins where this domain is known:
PF07_0027    PF10_0327   


G3DSA:1.10.10.600 - ISC_FeS_clus_asmbl_IscsX (Gene3D link)

Interpro entry IPR007479 : ISC system FeS cluster assembly, IscX (Interpro link)

Interpro description:

Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.

The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.
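
The operon names quoted above (iscRSUA-hscBA-fdx-iscX and sufABCDSE) use a compact shorthand. The sketch below shows one way to expand them into individual gene names. It assumes, as a reading of the notation rather than anything stated in the entry, that each hyphen-separated cluster is a three-letter lowercase prefix followed by zero or more single-letter gene suffixes. The function name expand_operon is hypothetical.

```python
import re


def expand_operon(shorthand):
    """Expand operon shorthand such as 'hscBA' into ['hscB', 'hscA'].

    Assumed convention: three lowercase letters, then optional capital
    suffixes, one per gene. A cluster without suffixes (e.g. 'fdx') is
    taken to be a single gene.
    """
    genes = []
    for cluster in shorthand.split("-"):
        match = re.fullmatch(r"([a-z]{3})([A-Z]*)", cluster)
        if match is None:
            raise ValueError(f"unrecognised cluster: {cluster!r}")
        prefix, suffixes = match.groups()
        if suffixes:
            genes.extend(prefix + suffix for suffix in suffixes)
        else:
            genes.append(prefix)
    return genes


isc_genes = expand_operon("iscRSUA-hscBA-fdx-iscX")
suf_genes = expand_operon("sufABCDSE")
print(isc_genes)
print(suf_genes)
```

Under that assumption the suf operon expands to six genes, matching the six encoded proteins mentioned in the text.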

In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.

This entry represents IscX proteins (also known as hypothetical protein YfhJ) that are part of the ISC system. IscX is active as a monomer. The structure of YfhJ is an orthogonal alpha-bundle. YfhJ is a small acidic protein that binds IscS, and contains a modified winged helix motif that is usually found in DNA-binding proteins. YfhJ/IscX can bind Fe, and may function as an Fe donor in the assembly of FeS clusters.

Proteins where this domain is known:
MAL13P1.307   


G3DSA:1.10.1000.11 - G3DSA:1.10.1000.11 (Gene3D link)

Proteins where this domain is known:
PF14_0407   


G3DSA:1.10.1040.10 - Opine_DH (Gene3D link)

Interpro entry IPR013328 : Dehydrogenase, multihelical (Interpro link)

Interpro description:

This entry represents a multi-helical domain found in several NAD or NADP-utilizing dehydrogenases, including 6-phosphogluconate dehydrogenase, classes I and II ketol-acid reductoisomerases, L-3-hydroxyacyl CoA dehydrogenase, UDP-glucose dehydrogenase, glycerol-3-phosphate dehydrogenase, ketopantoate reductase, N-(1-D-carboxyethyl)-L-norvaline dehydrogenase, and mannitol 2-dehydrogenase. This domain is often found in the C-terminal region of the protein.

Proteins where this domain is known:
PF11_0157    PF14_0520    PFL0780w   


G3DSA:1.10.1060.10 - Fum_reductase_C (Gene3D link)

Interpro entry IPR012285 : Fumarate reductase, C-terminal (Interpro link)

Interpro description:

Fumarate reductase catalyses the reduction of fumarate to succinate, coupling the reaction to the oxidation of quinol to quinone. This reaction is opposite to that catalysed by succinate dehydrogenase. This entry represents the C-terminal domain of fumarate reductase, which is structurally related to the N-terminal domain of dihydropyrimidine dehydrogenase, an enzyme that catalyses the NADPH-dependent conversion of pyrimidines to 5,6-dihydro compounds.

Proteins where this domain is known:
PF14_0334    PFL0630w   


G3DSA:1.10.1070.11 - PI3/4_kinase_cat (Gene3D link)

Interpro entry IPR000403 : Phosphatidylinositol 3- and 4-kinase, catalytic (Interpro link)

Interpro description:

Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.

Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase (PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3)) function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases, involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.

The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.

Proteins where this domain is known:
MAL13P1.19    PFD0965W    PFE0485w    PFE0765w   


G3DSA:1.10.1090.10 - Cyt_bd_Ubol_Oase_14kDa-su (Gene3D link)

Interpro entry IPR003197 : Cytochrome bd ubiquinol oxidase, 14 kDa subunit (Interpro link)

Interpro description:

The cytochrome bd type terminal oxidases catalyse quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre B558.

Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy.

The 14 kDa (or VI) subunit of the complex is not directly involved in electron transfer, but has a role in assembly of the complex.

Proteins where this domain is known:
PF10_0120   


G3DSA:1.10.1140.10 - G3DSA:1.10.1140.10 (Gene3D link)

Proteins where this domain is known:
PFL1725w   


G3DSA:1.10.1160.10 - G3DSA:1.10.1160.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.281    PF13_0170   


G3DSA:1.10.1200.10 - ACP_like (Gene3D link)

Proteins where this domain is known:
PFB0385w    PFL0415w   


G3DSA:1.10.1200.60 - Ribosomal_L19/L19e_dom3 (Gene3D link)

Interpro entry IPR015974 : Ribosomal protein L19/L19e, domain 3 (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker. This entry represents an alpha-helical domain that assumes an orthogonal bundle topology.

Proteins where this domain is known:
PFF0700c   


G3DSA:1.10.1300.10 - G3DSA:1.10.1300.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.118    MAL13P1.119    PF14_0672    PFL0475w   


G3DSA:1.10.132.10 - TopoI_cat_a/b-sub_euk (Gene3D link)

Interpro entry IPR014727 : DNA topoisomerase I, catalytic core, alpha/beta subdomain, eukaryotic-type (Interpro link)

Interpro description:

DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.
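
The classification described in the two paragraphs above can be summarised as a lookup table. The sketch below is only a restatement of that text in Python; the names TYPE_CLASSES, TYPE_I_SUBTYPES and classify are hypothetical and are not taken from InterPro.

```python
# Topoisomerase classes as listed in the description: the roman numeral
# of the enzyme determines whether it breaks one strand or both.
TYPE_CLASSES = {
    "type I": {"members": ("I", "III", "V"), "breaks": "single-strand"},
    "type II": {"members": ("II", "IV", "VI"), "breaks": "double-strand"},
}

# Subdivision of the type I enzymes, again as given in the text.
TYPE_I_SUBTYPES = {
    "IA": (
        "bacterial and archaeal topoisomerase I",
        "topoisomerase III",
        "reverse gyrase",
    ),
    "IB": ("eukaryotic topoisomerase I", "topoisomerase V"),
}


def classify(numeral):
    """Return (class name, kind of break) for a topoisomerase numeral."""
    for type_name, info in TYPE_CLASSES.items():
        if numeral in info["members"]:
            return type_name, info["breaks"]
    raise KeyError(f"unknown topoisomerase: {numeral}")


print(classify("III"))
print(classify("VI"))
```

Reverse gyrase has no roman numeral, so it appears only in the subtype table.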

This entry represents the alpha/beta subdomain that comprises part of the catalytic core of eukaryotic and viral topoisomerase I (type IB) enzymes, which occurs near the C-terminal region of the protein.

Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.

Vaccinia virus, a cytoplasmically-replicating poxvirus, encodes a type I DNA topoisomerase that is biochemically similar to eukaryotic-like DNA topoisomerases I, and which has been widely studied as a model topoisomerase. It is the smallest topoisomerase known and is unusual in that it is resistant to the potent chemotherapeutic agent camptothecin. The crystal structure of an amino-terminal fragment of vaccinia virus DNA topoisomerase I shows that the fragment forms a five-stranded, antiparallel beta-sheet with two short alpha-helices and connecting loops. Residues that are conserved between all eukaryotic-like type I topoisomerases are not clustered in particular regions of the structure.

More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

Proteins where this domain is known:
PFE0520c   


G3DSA:1.10.132.20 - G3DSA:1.10.132.20 (Gene3D link)

Proteins where this domain is known:
PFB0390w   


G3DSA:1.10.1320.10 - G3DSA:1.10.1320.10 (Gene3D link)

Proteins where this domain is known:
PF11_0264   


G3DSA:1.10.1410.10 - G3DSA:1.10.1410.10 (Gene3D link)

Proteins where this domain is known:
PFF1240w   


G3DSA:1.10.1420.10 - G3DSA:1.10.1420.10 (Gene3D link)

Proteins where this domain is known:
MAL7P1.206    PF14_0254    PFE0270c   


G3DSA:1.10.150.110 - DNA_pol_b_N-like (Gene3D link)

Interpro entry IPR010996 : DNA-directed DNA polymerase, family X, beta-like, N-terminal (Interpro link)

Interpro description:

Mammalian DNA polymerase beta (polB) is a 39-kDa protein with both nucleotidyltransferase and 5'-deoxyribose phosphodiesterase activities, playing a role in both excision repair and meiosis. polB has a modular organisation with an 8-kDa N-terminal domain (NTD) connected to the 31-kDa C-terminal domain by a protease-hypersensitive hinge region. The NTD acts as a single-stranded DNA binding domain, interacting most efficiently with the 5'-phosphate of the downstream primer of the gapped DNA. This interaction is mediated by a helix-hairpin-helix motif (HhH), which is also found in several other DNA repair enzymes. The residue threonine 79 (T79), which is located within the NTD, was identified as being critical to polB function, even though it makes no contact with either DNA template or dNTP substrate; T79 is located between two HhH motifs, and acts as a hinge residue that is important for positioning the DNA within the active site.

The catalytic core (residues 148-242) of murine terminal deoxynucleotidyl transferase (TdT) displays a structural fold that is similar to polB, and shares a common two-metal ion mechanism of nucleotidyl transfer with polB. TdT elongates DNA strands in a template-independent manner, and belongs to the pol X family of polymerases. TdT has only been found in vertebrates, where it is highly conserved. TdT brings additional diversity in the immune repertoire by adding nucleotides, called N regions, to the V(D)J recombination junction sites of immunoglobulin and T-cell receptor genes.

Proteins where this domain is known:
PF14_0470   


G3DSA:1.10.150.190 - G3DSA:1.10.150.190 (Gene3D link)

Proteins where this domain is known:
PF07_0117   


G3DSA:1.10.150.20 - G3DSA:1.10.150.20 (Gene3D link)

Proteins where this domain is known:
MAL8P1.76    PF11_0087    PF11_0264    PF14_0112    PFB0265c    PFD0420c    PFF1225c   


G3DSA:1.10.150.50 - SAM_type (Gene3D link)

Interpro entry IPR013761 : (Interpro link)

Interpro description:

Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing proteins. SAM domains exhibit a conserved structure, consisting of a 4-5-helical bundle of two orthogonally packed alpha-hairpins. However, SAM domains display a diversity of function, being involved in interactions with proteins, DNA and RNA. The name sterile alpha motif arose from its presence in proteins that are essential for yeast sexual differentiation. The SAM domain has had various names, including SPM, PTN (pointed), SEP (yeast sterility, Ets-related, PcG proteins), NCR (N-terminal conserved region) and HLH (helix-loop-helix) domain, all of which are related and can be classified as SAM domains.

SAM domains occur in eukaryotic and in some bacterial proteins. Structures have been determined for several proteins that contain SAM domains, including Ets-1 transcription factor, which plays a role in the development and invasion of tumour cells by regulating the expression of matrix-degrading proteases; Etv6 transcription factor, gene rearrangements of which have been demonstrated in several malignancies; EphA4 receptor tyrosine kinase, which is believed to be important for the correct localization of a motoneuron pool to a specific position in the spinal cord; EphB2 receptor, which is involved in spine morphogenesis via intersectin, Cdc42 and N-Wasp; p73, a p53 homologue involved in neuronal development; and polyhomeotic, which is a member of the Polycomb group of genes (Pc-G) required for the maintenance of the spatial expression pattern of homeotic genes.

Proteins where this domain is known:
PF11_0079   


G3DSA:1.10.150.60 - ARID (Gene3D link)

Interpro entry IPR001606 : AT-rich interaction region (Interpro link)

Interpro description:

Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.

The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.

Proteins where this domain is known:
PFF0175c   


G3DSA:1.10.1540.10 - Beige_BEACH (Gene3D link)

Interpro entry IPR000409 : (Interpro link)

Interpro description:

The "beige" mouse is established as an animal model of Chediak-Higashi Syndrome (CHS). The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. It is also found in distantly related proteins, for example those that are factors associated with neutral sphingomyelinase activation.

The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.

Proteins where this domain is known:
PF11_0252   


G3DSA:1.10.1580.10 - G3DSA:1.10.1580.10 (Gene3D link)

Proteins where this domain is known:
PF14_0345    PFD0530c   


G3DSA:1.10.1620.10 - Ribosomal_L39 (Gene3D link)

Interpro entry IPR000077 : Ribosomal protein L39e (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes.

Proteins where this domain is known:
PFF0573c   


G3DSA:1.10.1620.20 - ATPase_F1_e_mt (Gene3D link)

Interpro entry IPR006721 : ATPase, F1 complex, epsilon subunit, mitochondrial (Interpro link)

Interpro description:

ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

This family constitutes the mitochondrial ATP synthase epsilon subunit, which is distinct from the bacterial epsilon subunit (the latter being homologous to the mitochondrial delta subunit). The mitochondrial epsilon subunit is located in the stalk region of the F1 complex, and acts as an inhibitor of the ATPase catalytic core. The epsilon subunit can assume two conformations, contracted and extended, where the latter inhibits ATP hydrolysis. The conformation of the epsilon subunit is determined by the direction of rotation of the gamma subunit, and possibly by the presence of ADP. The epsilon subunit is thought to become extended in the presence of ADP, thereby acting as a safety lock to prevent wasteful ATP hydrolysis.

More information about this protein can be found at Protein of the Month: ATP Synthases.

Proteins where this domain is known:
MAL7P1.75   


G3DSA:1.10.1650.10 - Ribosomal_L19/L19e_dom1 (Gene3D link)

Interpro entry IPR015972 : Ribosomal protein L19/L19e, domain 1 (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker. This entry represents an alpha-helical domain that assumes an orthogonal bundle topology.

Proteins where this domain is known:
PFF0700c   


G3DSA:1.10.1670.10 - G3DSA:1.10.1670.10 (Gene3D link)

Proteins where this domain is known:
PFF0715c    PFI0835c   


G3DSA:1.10.1780.10 - G3DSA:1.10.1780.10 (Gene3D link)

Proteins where this domain is known:
PF08_0063    PF11_0175    PF14_0063   


G3DSA:1.10.1820.10 - Casein_kin_II_reg-sub_a-hlx (Gene3D link)

Interpro entry IPR016149 : Casein kinase II, regulatory subunit, alpha-helical (Interpro link)

Interpro description:

Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta-subunit is a highly conserved protein of about 25 kDa that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.

This entry represents the N-terminal alpha-helical domain, which has an orthogonal bundle topology.

Proteins where this domain is known:
PF11_0048    PF13_0232   


G3DSA:1.10.183.10 - no description (Gene3D link)

Proteins where this domain is known:
MAL13P1.148    PF11_0416    PF13_0233    PFE0175c    PFL1435c   


G3DSA:1.10.1900.20 - G3DSA:1.10.1900.20 (Gene3D link)

Proteins where this domain is known:
PF14_0709   


G3DSA:1.10.20.10 - Histone-fold (Gene3D link)

Interpro entry IPR009072 : Histone-fold (Interpro link)

Interpro description:

Histones mediate DNA organisation and play a dominant role in regulating eukaryotic transcription. The histone-fold consists of a core of three helices, where the long middle helix is flanked at each end by shorter ones. Proteins displaying this structure include the nucleosome core histones, which form octamers composed of two copies of each of the four histones, H2A, H2B, H3 and H4; archaeal histone, which possesses only the core domain part of eukaryotic histone; and the TATA-box binding protein (TBP)-associated factors (TAF), where the histone fold is a common motif for mediating TAF-TAF interactions. TAF proteins include TAF(II)18 and TAF(II)28, which form a heterodimer, TAF(II)42 and TAF(II)62, which form a heterotetramer similar to (H3-H4)2, and the negative cofactor 2 (NC2) alpha and beta chains, which form a heterodimer. The TAF proteins are a component of transcription factor IID (TFIID), along with the TBP protein. TFIID forms part of the pre-initiation complex on core promoter elements required for RNA polymerase II-dependent transcription. The TAF subunits of TFIID mediate transcriptional activation of subsets of eukaryotic genes. The NC2 complex mediates the inhibition of TATA-dependent transcription through interactions with TBP.

Proteins where this domain is known:
PF07_0054    PF11_0061    PF11_0062    PF11_0477    PF13_0043    PF13_0185    PF14_0374    PFC0920w    PFF0510w    PFF0860c    PFF0865w   


G3DSA:1.10.220.20 - G3DSA:1.10.220.20 (Gene3D link)

Proteins where this domain is known:
PF14_0407   


G3DSA:1.10.220.40 - G3DSA:1.10.220.40 (Gene3D link)

Proteins where this domain is known:
MAL13P1.244   


G3DSA:1.10.238.10 - EF-Hand_type (Gene3D link)

Interpro entry IPR011992 : EF-Hand type (Interpro link)

Interpro description:

This domain consists of a duplication of two EF-hand units, where each unit is composed of two helices connected by a twelve-residue calcium-binding loop. The calcium ion in the EF-hand loop is coordinated in a pentagonal bipyramidal configuration. Many calcium-binding proteins contain an EF-hand type calcium-binding domain. These include: calbindin D9K, S100 proteins such as calcyclin, polcalcin Phl p 7 (a calcium-binding pollen allergen), osteonectin, parvalbumin, calmodulin family of proteins (troponin C, caltractin, cdc4p, myosin essential chain, calcineurin, recoverin, neurocalcin), plasmodial-specific Ca(II)-binding protein Cbp40, penta-EF-Hand proteins (sorcin, grancalcin, calpain), as well as multidomain proteins such as phosphoinositide-specific phospholipase C, dystrophin, Cbl and alpha-actinin. The fold consists of four helices and an open array of two hairpins.

Proteins where this domain is known:
MAL13P1.156    MAL7P1.10    MAL7P1.69    MAL8P1.79    PF07_0072    PF10_0145    PF10_0177    PF10_0244    PF10_0271    PF10_0301    PF11_0066    PF11_0098    PF11_0239    PF11_0242    PF11_0389    PF13_0211    PF14_0181    PF14_0224    PF14_0323    PF14_0420    PF14_0443    PF14_0492    PF14_0607    PFA0305c    PFA0345w    PFA0515w    PFB0815w    PFC0190c    PFC0420w    PFD0692c    PFF0265c    PFF0520w    PFF1320c    PFL2225w   


G3DSA:1.10.240.10 - G3DSA:1.10.240.10 (Gene3D link)

Proteins where this domain is known:
MAL8P1.125    PF11_0181    PF13_0205   


G3DSA:1.10.245.10 - SWIB_MDM2 (Gene3D link)

Interpro entry IPR003121 : (Interpro link)

Interpro description:

The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family have at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C-terminus of DNA topoisomerase in Chlamydia.

MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded beta-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft on which the p53 alpha-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain.

The SWIB and MDM2 domains are homologous and share a common fold.

Proteins where this domain is known:
PFF0560c   


G3DSA:1.10.260.30 - SRP54_M (Gene3D link)

Interpro entry IPR004125 : Signal recognition particle, SRP54 subunit, M-domain (Interpro link)

Interpro description:

The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

This entry represents the M domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, of low complexity, and poorly conserved between species.

These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.

Proteins where this domain is known:
PF14_0477   


G3DSA:1.10.260.40 - G3DSA:1.10.260.40 (Gene3D link)

Proteins where this domain is known:
PF11_0293   


G3DSA:1.10.268.10 - Topo_IIA_A_a (Gene3D link)

Interpro entry IPR013757 : DNA topoisomerase, type IIA, subunit A, alpha-helical (Interpro link)

Interpro description:

DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

This entry represents a mainly alpha helical domain of subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV. It does not include the topoisomerase II enzymes composed of a single polypeptide, as are found in most eukaryotes.

More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

Proteins where this domain is known:
PF14_0316   


G3DSA:1.10.285.10 - no description (Gene3D link)

Proteins where this domain is known:
PF14_0164    PF14_0286   


G3DSA:1.10.286.20 - G3DSA:1.10.286.20 (Gene3D link)

Proteins where this domain is known:
PFC0225c   


G3DSA:1.10.287.10 - S15_NS1_RNA_bd (Gene3D link)

Interpro entry IPR009068 : (Interpro link)

Interpro description:

The RNA-binding domains of the ribosomal protein S15 and the influenza virus non-structural protein NS1 share the same structural fold, consisting of three helices in an irregular array. S15 is one of 21 proteins in the small, bacterial 30S ribosomal subunit, and is required for assembly of the subunit through its binding to 16S rRNA. The multifunctional glutamyl-prolyl-tRNA synthetase (EPRS) contains three tandem repeats linking two catalytic domains, all three of which contribute to RNA-binding; the second repeated element bears structural resemblance to the S15/NS1 RNA-binding domain.

Proteins where this domain is known:
PF11_0072    PF13_0059   


G3DSA:1.10.287.110 - DnaJ_N (Gene3D link)

Interpro entry IPR001623 : Heat shock protein DnaJ, N-terminal (Interpro link)

Interpro description:

The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.

Such a structure is shown in the following schematic representation:

It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
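
The CXXCXGXG repeat mentioned above can be read as a simple sequence pattern, where C and G are fixed residues and X is any residue. The following is a minimal sketch of that reading in Python, not the method InterPro or Gene3D uses to assign this domain (those rely on profile and structure-based models). The function name find_crr_repeats and the toy sequence are hypothetical and serve only as an illustration.

```python
import re
from typing import List

# CXXCXGXG written as a regular expression: "." stands for X (any residue).
CRR_MOTIF = re.compile(r"C..C.G.G")


def find_crr_repeats(sequence: str) -> List[int]:
    """Return 0-based start positions of non-overlapping CXXCXGXG matches."""
    return [m.start() for m in CRR_MOTIF.finditer(sequence.upper())]


# Made-up toy sequence with four repeats; NOT a real DnaJ protein.
toy = "MAAACPTCHGSGAAACEKCNGTGAAACSACQGRGAAACPDCGGKGAAA"
print(find_crr_repeats(toy))  # [4, 15, 26, 37]
```

A plain pattern match like this will also hit unrelated cysteine-rich stretches, so it is best treated as a quick screen rather than evidence that a CRR domain is present.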

Proteins where this domain is known:
MAL13P1.162    MAL13P1.277    MAL8P1.204    PF08_0032    PF08_0115    PF10_0032    PF10_0058    PF10_0378    PF10_0381    PF11_0034    PF11_0099    PF11_0380    PF11_0433    PF11_0443    PF11_0509    PF11_0512    PF11_0513    PF13_0036    PF13_0102    PF14_0013    PF14_0111    PF14_0137    PF14_0213    PF14_0359    PF14_0700    PFA0110w    PFA0660w    PFA0675w    PFB0085c    PFB0090c    PFB0595w    PFB0920w    PFB0925w    PFD0462w    PFE0055c    PFE0135w    PFE1170w    PFF1010c    PFF1415c    PFI0935w    PFI0985c    PFL0055c    PFL0565w    PFL0815w    PFL2550w   


G3DSA:1.10.287.20 - G3DSA:1.10.287.20 (Gene3D link)

Proteins where this domain is known:
PF14_0248   


G3DSA:1.10.287.310 - Ribosomal_L29 (Gene3D link)

Interpro entry IPR001854 : Ribosomal protein L29 (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues that are grouped on the basis of sequence similarities.

Proteins where this domain is known:
PF11_0260   


G3DSA:1.10.287.370 - G3DSA:1.10.287.370 (Gene3D link)

Proteins where this domain is known:
PFE0595w   


G3DSA:1.10.287.40 - Ser-tRNA-synth_IIa_N (Gene3D link)

Interpro entry IPR015866 : Seryl-tRNA synthetase, class IIa, N-terminal (Interpro link)

Interpro description:

The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

This entry represents the N-terminal domain of seryl-tRNA synthetase, which consists of two helices in a long alpha-hairpin. Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.

Proteins where this domain is known:
PF07_0073   


G3DSA:1.10.287.600 - G3DSA:1.10.287.600 (Gene3D link)

Proteins where this domain is known:
PF08_0125    PF10_0084    PF14_0725    PFD1050w    PFI0180w   


G3DSA:1.10.287.70 - G3DSA:1.10.287.70 (Gene3D link)

Proteins where this domain is known:
PF14_0622    PFL1315w   


G3DSA:1.10.287.700 - Synuclein (Gene3D link)

Interpro entry IPR001058 : Synuclein (Interpro link)

Interpro description:

Synucleins are small, soluble proteins expressed primarily in neural tissue and in certain tumors. The family includes three known proteins: alpha-synuclein, beta-synuclein, and gamma-synuclein. All synucleins have in common a highly conserved alpha-helical lipid-binding motif with similarity to the class-A2 lipid-binding domains of the exchangeable apolipoproteins.

Synuclein family members are not found outside vertebrates, although they have some conserved structural similarity with plant 'late-embryo-abundant' proteins. The alpha- and beta-synuclein proteins are found primarily in brain tissue, where they are seen mainly in presynaptic terminals. The gamma-synuclein protein is found primarily in the peripheral nervous system and retina, but its expression in breast tumors is a marker for tumor progression. Normal cellular functions have not been determined for any of the synuclein proteins, although some data suggest a role in the regulation of membrane stability and/or turnover. Mutations in alpha-synuclein are associated with rare familial cases of early-onset Parkinson's disease, and the protein accumulates abnormally in Parkinson's disease, Alzheimer's disease, and several other neurodegenerative illnesses.

Proteins where this domain is known:
PF13_0355   


G3DSA:1.10.290.10 - Topo_IA_cen_sub3 (Gene3D link)

Interpro entry IPR013826 : DNA topoisomerase, type IA, central region, subdomain 3 (Interpro link)

Interpro description:

DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

Type IA topoisomerases are composed of four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA: domain 1 is an N-terminal alpha/beta Toprim domain, domain 2 and the C-terminal domain 4 are winged-helix domains, and domain 3 is a beta-barrel. Domains 1 (Toprim) and 3 form the active site of the enzyme, while the winged-helix domains 2 and 4 form a single-strand DNA-binding groove. This entry represents the alpha-bundle subdomain 3 of the central region of topoisomerase type IA enzymes, where the central region covers both domains 2 and 3.

More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

Proteins where this domain is known:
PF13_0251   


G3DSA:1.10.30.10 - HMG-box (Gene3D link)

Interpro entry IPR000910 : High mobility group, HMG1/HMG2 (Interpro link)

Interpro description:

High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.

The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes: the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.

Proteins where this domain is known:
MAL13P1.290    MAL8P1.72    PFL0145c    PFL0290w   


G3DSA:1.10.3030.10 - Gamete_antigen_PLAspp (Gene3D link)

Interpro entry IPR015299 : (Interpro link)

Interpro description:

Members of this family are essential for gametocytogenesis in Plasmodium falciparum. They contain a fold composed of two pseudo dyad-related repeats of the helix-turn-helix motif, serving as a platform for RNA and Src homology-3 (SH3) binding.

Proteins where this domain is known:
PFB0115w   


G3DSA:1.10.3090.10 - G3DSA:1.10.3090.10 (Gene3D link)

Proteins where this domain is known:
PF11_0212   


G3DSA:1.10.3120.10 - G3DSA:1.10.3120.10 (Gene3D link)

Proteins where this domain is known:
PF14_0249   


G3DSA:1.10.3260.10 - no description (Gene3D link)

Proteins where this domain is known:
MAL13P1.22   


G3DSA:1.10.340.30 - G3DSA:1.10.340.30 (Gene3D link)

Proteins where this domain is known:
PF11_0306    PFI0835c   


G3DSA:1.10.395.10 - G3DSA:1.10.395.10 (Gene3D link)

Proteins where this domain is known:
PF14_0352   


G3DSA:1.10.40.20 - G3DSA:1.10.40.20 (Gene3D link)

Proteins where this domain is known:
PF14_0352   


G3DSA:1.10.418.10 - Calponin-homology (Gene3D link)

Interpro entry IPR001715 : (Interpro link)

Interpro description:

The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:

A comprehensive review of proteins containing this type of actin-binding domains is given in.

The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in their actin-binding activity. Most proteins have two copies of the CH domain; however, some proteins, such as calponin and the human vav proto-oncogene, have only a single copy. The structure of an example CH-domain has recently been solved.

Proteins where this domain is known:
PFC0305w   


G3DSA:1.10.443.10 - Phage_intgr_like (Gene3D link)

Interpro entry IPR013762 : Integrase-like, catalytic core, phage (Interpro link)

Interpro description:

Phage integrases are enzymes that mediate unidirectional site-specific recombination between two DNA recognition sequences, the phage attachment site, attP, and the bacterial attachment site, attB. Integrases may be grouped into two major families, the tyrosine recombinases and the serine recombinases, based on their mode of catalysis. Tyrosine family integrases, such as lambda integrase, utilise a catalytic tyrosine to mediate strand cleavage, tend to recognize longer attP sequences, and require other proteins encoded by the phage or the host bacteria.

The 356 amino acid lambda integrase consists of two domains: an N-terminal domain that includes residues 1-64 and is responsible for binding the arm-type sites of attP, and a C-terminal domain (CTD) that binds the lower affinity core-type sites and contains the catalytic site. The CTD can be further divided into the core-type binding domain (residues 65-169) and the catalytic core domain (170-356), the latter representing this entry. The catalytic core adopts an alpha3-beta3-alpha4 fold, where one side of the beta sheet is exposed.
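
The residue boundaries quoted above can be written down as a small lookup table. The following is only an illustrative sketch: the names LAMBDA_INT_DOMAINS and domain_for_residue are hypothetical, and the ranges are taken directly from the description rather than from any database API.

```python
# Residue ranges (1-based, inclusive) as quoted in the description above.
# The table and function names are illustrative assumptions.
LAMBDA_INT_DOMAINS = [
    (1, 64, "N-terminal arm-type binding domain"),
    (65, 169, "core-type binding domain"),
    (170, 356, "catalytic core domain"),
]


def domain_for_residue(position):
    """Return the name of the lambda integrase domain containing a residue."""
    for start, end, name in LAMBDA_INT_DOMAINS:
        if start <= position <= end:
            return name
    raise ValueError(f"residue {position} is outside the 356-residue protein")


print(domain_for_residue(200))  # catalytic core domain
```

Only the last range (170-356) corresponds to this entry; the other two are listed for context.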

The recombinases Cre from phage P1, XerD from Escherichia coli and Flp from yeast are members of the tyrosine recombinase family, and have a two-domain motif resembling that of lambda integrase, as well as sharing a conserved binding mechanism. The structural fold of their catalytic core domains resembles that of lambda integrase.

Proteins where this domain is known:
MAL13P1.42   


G3DSA:1.10.455.10 - Ribosomal_S7 (Gene3D link)

Interpro entry IPR000235 : Ribosomal protein S7 (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure of S7 is known.

Proteins where this domain is known:
PF07_0088   


G3DSA:1.10.460.10 - Topo_IA_cen_sub1 (Gene3D link)

Interpro entry IPR013824 : DNA topoisomerase, type IA, central region, subdomain 1 (Interpro link)

Interpro description:

DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

Type IA topoisomerases comprise four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA: domain 1 is an N-terminal alpha/beta Toprim domain, domain 2 and the C-terminal domain 4 are winged-helix domains, and domain 3 is a beta-barrel. Domains 1 (Toprim) and 3 form the active site of the enzyme, while the winged helix domains 2 and 4 form a single-strand DNA-binding groove. This entry represents the alpha-bundle subdomain 1 of the central region of topoisomerase type IA enzymes, where the central region covers both domains 2 and 3.

More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

Proteins where this domain is known:
PF13_0251   


G3DSA:1.10.465.10 - G3DSA:1.10.465.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.148    PF11_0416    PF13_0233    PFE0175c    PFF0675c    PFL1435c   


G3DSA:1.10.472.10 - Cyclin_related (Gene3D link)

Interpro entry IPR013763 : (Interpro link)

Interpro description:

Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin-dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins: G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.

This domain is also found as the core domain in transcription factor IIB (TFIIB) and in the retinoblastoma tumour suppressor.

Proteins where this domain is known:
PF13_0022    PF14_0469    PF14_0605    PFA0525w    PFF0270c   


G3DSA:1.10.472.30 - TFIIS_centre (Gene3D link)

Interpro entry IPR003618 : Transcription elongation factor S-II, central region (Interpro link)

Interpro description:

Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

This domain is found in the central region of transcription elongation factor S-II and in several hypothetical proteins.

Proteins where this domain is known:
PF07_0057    PF11_0289   


G3DSA:1.10.510.10 - G3DSA:1.10.510.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.114    MAL13P1.185    MAL13P1.196    MAL13P1.278    MAL13P1.279    MAL13P1.84    MAL7P1.100    MAL7P1.127    MAL7P1.132    MAL7P1.144    MAL7P1.175    MAL7P1.73    MAL7P1.91    MAL8P1.203    MAL8P1.42    PF07_0072    PF08_0044    PF10_0141    PF10_0160    PF10_0380    PF11_0060    PF11_0079    PF11_0096    PF11_0127    PF11_0147    PF11_0156    PF11_0220    PF11_0227    PF11_0239    PF11_0242    PF11_0377    PF11_0464    PF11_0488    PF11_0510    PF13_0085    PF13_0166    PF13_0211    PF13_0258    PF14_0227    PF14_0264    PF14_0294    PF14_0320    PF14_0346    PF14_0392    PF14_0408    PF14_0423    PF14_0431    PF14_0476    PF14_0516    PF14_0734    PFA0130c    PFA0380w    PFB0150c    PFB0520w    PFB0605w    PFB0665w    PFB0815w    PFC0060c    PFC0105w    PFC0385c    PFC0420w    PFC0485w    PFC0525c    PFC0755c    PFC0945w    PFD0740w    PFD0865c    PFD1165w    PFD1175w    PFE0045c    PFE1290w    PFF0260w    PFF0520w    PFF0750w    PFF1145c    PFF1370w    PFI0095c    PFI0100c    PFI0105c    PFI0110c    PFI0115c    PFI0120c    PFI0125c    PFI1275w    PFI1280c    PFI1290w    PFI1415w    PFI1685w    PFL0040c    PFL0080c    PFL1370w    PFL1885c    PFL2250c    PFL2280w   


G3DSA:1.10.520.20 - ATPase_F1_OSCP/d (Gene3D link)

Interpro entry IPR000711 : ATPase, F1 complex, OSCP/delta subunit (Interpro link)

Interpro description:

ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).

More information about this protein can be found at Protein of the Month: ATP Synthases.

Proteins where this domain is known:
MAL13P1.47   


G3DSA:1.10.555.10 - RhoGAP (Gene3D link)

Interpro entry IPR000198 : RhoGAP (Interpro link)

Interpro description:

Members of the Rho family of small G proteins transduce signals from plasma-membrane receptors and control cell adhesion, motility and shape by actin cytoskeleton formation. Like all other GTPases, Rho proteins act as molecular switches, with an active GTP-bound form and an inactive GDP-bound form. The active conformation is promoted by guanine-nucleotide exchange factors, and the inactive state by GTPase-activating proteins (GAPs) which stimulate the intrinsic GTPase activity of small G proteins. This entry is a Rho/Rac/Cdc42-like GAP domain that is found in a wide variety of large, multi-functional proteins. A number of structures are known for this family. The domain is composed of seven alpha helices. This domain is also known as the breakpoint cluster region-homology (BH) domain.

Proteins where this domain is known:
PF10_0071   


G3DSA:1.10.560.10 - G3DSA:1.10.560.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.283    PF10_0153    PF11_0331    PFB0635w    PFC0285c    PFC0350c    PFC0900w    PFF0430w    PFL1425w    PFL1545c   


G3DSA:1.10.575.10 - Phospholipase_C/P1_nuclease (Gene3D link)

Interpro entry IPR008947 : Phospholipase C/P1 nuclease, core (Interpro link)

Interpro description:

The enzymes belonging to this family are involved in phosphate ester hydrolysis and contain a triad of closely spaced zinc ions at their active centres. Both families of enzymes hydrolyse phosphodiesters. Substrates for phospholipase C are phosphatidylinositol and phosphatidylcholine, while P1 nuclease is an endonuclease hydrolysing single-stranded ribo- and deoxyribonucleotides. P1 nuclease also has activity as a phosphomonoesterase against 3'-terminal phosphates of nucleotides. The Zn ions in both enzymes form almost identical trinuclear sites.

Proteins where this domain is known:
PF14_0117    PF14_0119    PFI0385c   


G3DSA:1.10.580.10 - Citrate_synthase_lrg_a-sub (Gene3D link)

Interpro entry IPR016142 : Citrate synthase-like, large alpha subdomain (Interpro link)

Interpro description:

Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).

Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.

This entry represents the large alpha-helical domain from type I and II citrate synthase enzymes, as well as a homologous domain found in the related enzyme 2-methylcitrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive.

Proteins where this domain is known:
PF10_0218    PFF0455w   


G3DSA:1.10.60.20 - G3DSA:1.10.60.20 (Gene3D link)

Proteins where this domain is known:
PFL2055w   


G3DSA:1.10.600.10 - Terpenoid_synth (Gene3D link)

Interpro entry IPR008949 : (Interpro link)

Interpro description:

Terpenoid cyclases catalyze remarkably complex cyclisation cascades that are initiated by the formation of a highly reactive carbocation in a polyisoprene substrate. The pathways of monoterpene, sesquiterpene, and diterpene biosynthesis are conveniently divided into several stages. The first encompasses the synthesis of isopentenyl diphosphate, isomerization to dimethylallyl diphosphate, prenyltransferase-catalysed condensation of these two C5-units to geranyl diphosphate (GDP), and the subsequent 1'-4 additions of isopentenyl diphosphate to generate farnesyl (FDP) and geranylgeranyl (GGDP) diphosphate. In the second stage, the prenyl diphosphates undergo a range of cyclisations based on variations on the same mechanistic theme to produce the parent skeletons of each class. Thus, GDP (C10) gives rise to monoterpenes, FDP (C15) to sesquiterpenes, and GGDP (C20) to diterpenes. These transformations catalysed by the terpenoid synthases (cyclases) may be followed by a variety of redox modifications of the parent skeletal types to produce the many thousands of different terpenoid metabolites of the essential oils, turpentines, and resins of plant origin. Terpenoid synthase enzymes provide a template for binding and stabilizing the flexible substrate in the precise orientation required for catalysis, trigger carbocation formation, chaperone the conformations of the reactive carbocation intermediates through a unique cyclisation sequence, and sequester and stabilize carbocations from premature quenching.
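
The carbon counts quoted above follow from counting C5 units: two units condense to GDP, and each further isopentenyl diphosphate addition contributes one more. A minimal sketch of that arithmetic is given here; the names C5_UNIT, PRENYL_DIPHOSPHATES and carbon_count are hypothetical and chosen only for illustration.

```python
C5_UNIT = 5  # carbons contributed by each C5 unit

# Number of C5 units in each prenyl diphosphate, and the terpene class it
# gives rise to according to the description above.
PRENYL_DIPHOSPHATES = {
    "GDP": (2, "monoterpenes"),
    "FDP": (3, "sesquiterpenes"),
    "GGDP": (4, "diterpenes"),
}


def carbon_count(precursor):
    """Return the number of carbons in the prenyl chain of a precursor."""
    units, _product = PRENYL_DIPHOSPHATES[precursor]
    return units * C5_UNIT


for name, (units, product) in PRENYL_DIPHOSPHATES.items():
    print(f"{name}: {units} x C5 = C{carbon_count(name)} -> {product}")
```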

Proteins where this domain is known:
PF11_0295    PFB0130w   


G3DSA:1.10.620.20 - Ribncl_red_rel (Gene3D link)

Interpro entry IPR012348 : Ribonucleotide reductase-related (Interpro link)

Interpro description:

The R2 protein of ribonucleotide reductase catalyses the reduction of all four ribonucleotides to deoxyribonucleotides for use in DNA synthesis. This catalysis involves generating and storing a tyrosyl radical, which is essential for ribonucleotide reduction. The crystal structure consists of a core of four helices in a closed bundle with a left-handed twist and one crossover connection, and a bimetal-ion centre in the middle of the bundle.

This entry represents a family of proteins that are structurally related to the R2 protein of class I ribonucleotide reductase. The family includes the alpha and beta subunits of methane monooxygenase hydroxylase, and delta 9-stearoyl-acyl carrier protein desaturase.

Proteins where this domain is known:
PF10_0154    PF14_0053   


G3DSA:1.10.730.10 - G3DSA:1.10.730.10 (Gene3D link)

Proteins where this domain is known:
PF08_0011    PF10_0053    PF10_0340    PF13_0179    PF14_0589    PFC0470w    PFF1095w    PFL0900c    PFL1210w   


G3DSA:1.10.760.10 - Cytochrome_c_R (Gene3D link)

Interpro entry IPR009056 : Cytochrome c, monohaem (Interpro link)

Interpro description:

After cytochrome c is synthesized in the cytoplasm as apocytochrome c, it is transported through the outer mitochondrial membrane to the intermembrane space, where haem is covalently attached by thioether bonds to two cysteine residues located in the cytochrome c centre. Cytochrome c is required during oxidative phosphorylation as an electron shuttle between Complex III (cytochrome c reductase) and IV (cytochrome c oxidase). In addition, cytochrome c is involved in apoptosis in more complex organisms such as Xenopus, rats and humans. Cellular stress can induce cytochrome c release from the mitochondrial membrane. In mammals, cytochrome c triggers the assembly of the apoptosome, consisting of cytochrome c, Apaf-1 and dATP, which activates caspase-9, leading to cell death. There are several different members of the cytochrome c family with different functional roles; for instance, cytochrome c549 is associated with photosystem II.

The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes. The consensus sequence for the cytochrome c centre is Cys-X-X-Cys-His, where the histidine residue is one of the two axial ligands of the haem iron. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes both mono-haem proteins and multi-haem proteins. This entry represents mono-haem cytochrome c proteins (excluding class II and f-type cytochromes), such as cytochromes c, c1, c2, c5, c555, c550 to c553, c556, and c6.
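
The Cys-X-X-Cys-His consensus can be searched for in a protein sequence with a short regular expression. The sketch below is illustrative only: the function name find_cxxch_motifs and the toy sequence are hypothetical, and a motif hit on its own does not establish that a protein belongs to the cytochrome c family.

```python
import re

# CXXCH in one-letter amino acid codes: Cys, any two residues, Cys, His.
# The lookahead allows overlapping occurrences to be reported as well.
CXXCH = re.compile(r"(?=(C..CH))")


def find_cxxch_motifs(sequence):
    """Return (1-based start position, motif) pairs found in a protein sequence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXCH.finditer(seq)]


# Hypothetical toy sequence, not a real cytochrome c
print(find_cxxch_motifs("MKAGCAACHTVE"))  # [(5, 'CAACH')]
```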

Cytochrome c-type centres are also found in the active sites of many enzymes, including cytochrome cd1-nitrite reductase as the N-terminal haem c domain, quinoprotein alcohol dehydrogenase as the C-terminal domain, quinohemoprotein amine dehydrogenase A chain as domains 1 and 2, and the cytochrome bc1 complex as the cytochrome bc1 domain.

Proteins where this domain is known:
MAL13P1.55    PF14_0038    PF14_0597   


G3DSA:1.10.8.10 - G3DSA:1.10.8.10 (Gene3D link)

Proteins where this domain is known:
PF10_0114   


G3DSA:1.10.8.100 - G3DSA:1.10.8.100 (Gene3D link)

Proteins where this domain is known:
PFL2395c   


G3DSA:1.10.8.140 - TFAR19_DNA_bd (Gene3D link)

Interpro entry IPR002836 : DNA-binding TFAR19-related protein (Interpro link)

Interpro description:

This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.

Proteins where this domain is known:
PFI0450c   


G3DSA:1.10.8.30 - G3DSA:1.10.8.30 (Gene3D link)

Proteins where this domain is known:
PFC0225c    PFD0655c   


G3DSA:1.10.8.50 - G3DSA:1.10.8.50 (Gene3D link)

Proteins where this domain is known:
PF11_0272   


G3DSA:1.10.8.60 - no description (Gene3D link)

Proteins where this domain is known:
MAL8P1.92    PF07_0047    PF08_0063    PF10_0081    PF11_0175    PF11_0203    PF11_0314    PF13_0033    PF13_0063    PF14_0063    PF14_0126    PF14_0147    PF14_0548    PF14_0601    PF14_0616    PFB0840w    PFB0895c    PFC0140c    PFD0665c    PFF0940c    PFI0355c    PFL1925w    PFL2005w    PFL2345c   


G3DSA:1.20.1050.10 - GST_C_like (Gene3D link)

Interpro entry IPR010987 : (Interpro link)

Interpro description:

In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. GST is found as a domain in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.

Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain, which adopts a 4-helical bundle fold. This entry is the C-terminal domain.

Glutaredoxin 2 (Grx2), a glutathione-dependent disulphide oxidoreductase, is structurally similar to GSTs, even though the two lack any sequence similarity. Grx2 is also composed of N- and C-terminal subdomains. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with glutathione in cellular redox regulation, including the response to oxidative stress. Grx2 is dissimilar to other glutaredoxins apart from containing the conserved active site residues.

Proteins where this domain is known:
PF13_0214    PF14_0187   


G3DSA:1.20.1050.30 - no description (Gene3D link)

Proteins where this domain is known:
MAL13P1.169    PF14_0300    PFB0480w    PFL2070w   


G3DSA:1.20.1060.10 - G3DSA:1.20.1060.10 (Gene3D link)

Proteins where this domain is known:
PF14_0112   


G3DSA:1.20.1080.10 - MIP (Gene3D link)

Interpro entry IPR000425 : Major intrinsic protein (Interpro link)

Interpro description:

A number of transmembrane (TM) channel proteins can be grouped together on the basis of sequence similarities.

These include:

MIP family proteins are thought to contain 6 TM domains. Sequence analysis suggests that the proteins may have arisen through tandem, intragenic duplication from an ancestral protein that contained 3 TM domains.

Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins. Aquaporin-CHIP (Aquaporin 1) belongs to the Colton blood group system and is associated with Co(a/b) antigen.

Proteins where this domain is known:
PF08_0097    PF11_0338   


G3DSA:1.20.1100.10 - G3DSA:1.20.1100.10 (Gene3D link)

Proteins where this domain is known:
PF14_0246   


G3DSA:1.20.1110.10 - no description (Gene3D link)

Proteins where this domain is known:
PFA0310c    PFL0590c   


G3DSA:1.20.120.140 - SRP54_helical (Gene3D link)

Interpro entry IPR013822 : Signal recognition particle, SRP54 subunit, helical bundle (Interpro link)

Interpro description:

The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity and poorly conserved between species.

These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.

Proteins where this domain is known:
PF13_0350    PF14_0477   


G3DSA:1.20.120.180 - PA28_beta (Gene3D link)

Interpro entry IPR003186 : Proteasome activator pa28, REG beta subunit (Interpro link)

Interpro description:

PA28 activator complex (also known as 11S regulator of 20S proteasome) is a ring-shaped hexameric structure of alternating alpha (PA28alpha) and beta (PA28beta) subunits. The catalytic properties of PA28alpha- and PA28beta-activated proteasome are similar. This entry represents the beta subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner.

Proteins where this domain is known:
PFI0370c   


G3DSA:1.20.120.310 - Evr1_Alr (Gene3D link)

Interpro entry IPR006863 : Erv1/Alr (Interpro link)

Interpro description:

Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae (Baker's yeast) mitochondria is required for the maturation of Fe/S proteins in the cytosol. ALR (augmenter of liver regeneration) is a mammalian ortholog of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter.

Proteins where this domain is known:
PFA0500w   


G3DSA:1.20.1310.10 - G3DSA:1.20.1310.10 (Gene3D link)

Proteins where this domain is known:
PF08_0094    PFF1445c   


G3DSA:1.20.1390.10 - G3DSA:1.20.1390.10 (Gene3D link)

Proteins where this domain is known:
PFC0465c   


G3DSA:1.20.1460.10 - G3DSA:1.20.1460.10 (Gene3D link)

Proteins where this domain is known:
PFA0300c   


G3DSA:1.20.150.20 - G3DSA:1.20.150.20 (Gene3D link)

Proteins where this domain is known:
PFB0795w   


G3DSA:1.20.190.20 - 14-3-3 (Gene3D link)

Interpro entry IPR000308 : 14-3-3 protein (Interpro link)

Interpro description:

The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.

14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell through sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to bind initially to a single dominant site and subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
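
The RSxpSxP consensus described above can be turned into a simple sequence scan. The following is a minimal sketch, not part of the Interpro entry: the function name `candidate_sites` and the example peptide are hypothetical, and because a plain amino-acid sequence carries no phosphorylation information, every hit is only a candidate site whose second serine would still need to be phosphorylated.

```python
import re

# Sequence-level form of RSxpSxP: R, S, any residue, S (the phosphoacceptor),
# any residue, P. The lookahead lets overlapping candidates be reported.
MOTIF = re.compile(r"(?=(RS.S.P))")

def candidate_sites(seq):
    """Return (1-based position of the phosphoacceptor serine, matched motif)."""
    # The phosphoacceptor is the 4th residue of the motif, hence start + 4
    # when converting the 0-based match start to a 1-based position.
    return [(m.start() + 4, m.group(1)) for m in MOTIF.finditer(seq.upper())]

# Hypothetical peptide used only for illustration.
print(candidate_sites("MKRSTSTPNV"))
```

A scan like this only narrows the search; as the description notes, exceptions to the consensus exist, so a sequence without a match may still bind.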

Proteins where this domain is known:
MAL13P1.309    MAL8P1.69    PF14_0220   


G3DSA:1.20.200.10 - no description (Gene3D link)

Proteins where this domain is known:
PFB0295w   


G3DSA:1.20.272.10 - no description (Gene3D link)

Proteins where this domain is known:
PF14_0601    PFB0840w    PFL2005w   


G3DSA:1.20.5.110 - no description (Gene3D link)

Proteins where this domain is known:
MAL13P1.113    MAL13P1.16    MAL8P1.21    PF11_0052    PF14_0500    PFC0582c    PFC0890w    PFE1505w    PFL0505c   


G3DSA:1.20.5.320 - Fibritin/6PGD_C-extension (Gene3D link)

Interpro entry IPR012284 : (Interpro link)

Interpro description:

6-phosphogluconate dehydrogenase (6PGD) catalyses the oxidative decarboxylation of 6-phosphogluconate to ribulose 5-phosphate with the concomitant reduction of NADP to NADPH. This reaction is a component of the hexose monophosphate shunt, also known as the pentose phosphate pathway (PPP), which functions to generate ribose 5-phosphate for nucleotide and nucleic acid synthesis. Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, and the substrate binds in an adjacent pocket.

This entry represents the terminal 30-40 residues of the 6-phosphogluconate dehydrogenase C-terminal domain, which are lacking in certain 6PGD enzymes. The core of the C-terminal domain is represented by a separate entry. This region bears structural resemblance to the C-terminal portion of the Bacteriophage T4 fibritin protein, which is responsible for the attachment of long tail fibres to virus particles, and forms the "whiskers", or fibres, on the neck of the virion.

Proteins where this domain is known:
PF14_0520   


G3DSA:1.20.58.280 - G3DSA:1.20.58.280 (Gene3D link)

Proteins where this domain is known:
PF14_0548   


G3DSA:1.20.58.90 - G3DSA:1.20.58.90 (Gene3D link)

Proteins where this domain is known:
PFA0460c   


G3DSA:1.20.80.10 - ACBP (Gene3D link)

Interpro entry IPR014352 : FERM/acyl-CoA-binding protein, 3-helical bundle (Interpro link)

Interpro description:

This entry represents a structural domain with a core structure consisting of a 3-helical closed bundle with a left-handed twist, in an up-and-down arrangement. This structural motif occurs as subdomain 2 within FERM domains, as well as in acyl-CoA-binding proteins. The FERM domain (band F ezrin-radixin-moesin homology domain) has such a structure, acting as a common membrane-binding module involved in localising proteins to the plasma membrane. Proteins containing FERM include cytoskeletal proteins such as erythrocyte membrane protein 4.1R, talin, and the ezrin-radixin-moesin protein family, as well as several protein tyrosine kinases and phosphatases, and the neurofibromatosis 2 tumour suppressor protein merlin. The function of the ezrin-radixin-moesin protein family is to crosslink the actin filaments of cytoskeletal structures to the plasma membrane.

In addition, acyl-CoA-binding protein (ACBP) contains a domain with a similar 3-helical bundle structure. ACBP plays an important role in fatty acid metabolism, maintaining a pool of fatty acyl-CoA molecules in the cell.

Proteins where this domain is known:
PF08_0099    PF10_0015    PF10_0016    PF11_0197    PF14_0749   


G3DSA:1.20.920.10 - Bromodomain (Gene3D link)

Interpro entry IPR001487 : (Interpro link)

Interpro description:

Bromodomains are found in a variety of mammalian, invertebrate and yeast DNA-binding proteins. Bromodomains can interact with acetylated lysine. In some proteins, the classical bromodomain has diverged to such an extent that parts of the region are either missing or contain an insertion (e.g., mammalian protein HRX, Caenorhabditis elegans hypothetical protein ZK783.4, yeast protein YTA7). The bromodomain may occur as a single copy, or in duplicate.

The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.

Proteins where this domain is known:
PF08_0034    PF10_0328    PF14_0724    PFA0510w    PFF1440w    PFL0635c    PFL1645w   


G3DSA:1.20.930.10 - TFIIS_N_fun-typ (Gene3D link)

Interpro entry IPR014754 : Transcription elongation factor, TFIIS, N-terminal, fungal-type (Interpro link)

Interpro description:

Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, allowing it to read through template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS. This entry contains predominantly fungal forms of TFIIS, which is encoded by the gene PPR2 in Saccharomyces cerevisiae (Baker's yeast). The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix.

Proteins where this domain is known:
PF07_0057   


G3DSA:1.20.940.10 - Prp18 (Gene3D link)

Interpro entry IPR004098 : Prp18 (Interpro link)

Interpro description:

The splicing factor Prp18 is required for the second step of pre-mRNA splicing. PRP18 appears to be primarily associated with the U5 snRNP.

The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles.

Proteins where this domain is known:
PFI1115c   


G3DSA:1.20.990.10 - G3DSA:1.20.990.10 (Gene3D link)

Proteins where this domain is known:
PF14_0478    PFI1140w   


G3DSA:1.25.10.10 - ARM-like (Gene3D link)

Interpro entry IPR011989 : Armadillo-like helical (Interpro link)

Interpro description:

This domain consists of a multi-helical fold comprised of two curved layers of alpha helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids. This topology has been found with a number of repeats and domains, including the armadillo repeat (found in beta-catenins and importins), the HEAT repeat (found in protein phosphatase 2A and initiation factor eIF4G), the PHAT domain (found in Smaug RNA-binding protein), the leucine-rich repeat variant, the Pumilio repeat, and in the H regulatory subunit of V-type ATPases. The sequence similarity among these different repeats or domains is low; however, they exhibit considerable structural similarity. Furthermore, the number of repeats present in the superhelical structure can vary between orthologues, indicating that rapid loss/gain of repeats has occurred frequently in evolution. A common phylogenetic origin has been proposed for the armadillo and HEAT repeats.

Proteins where this domain is known:
MAL13P1.105    MAL13P1.123    MAL13P1.26    MAL13P1.308    MAL7P1.164    MAL7P1.202    MAL8P1.123    PF08_0069    PF08_0087    PF10_0335    PF10_0351    PF11_0318    PF11_0463    PF11_0527    PF13_0013    PF13_0034    PF14_0196    PF14_0277    PF14_0304    PF14_0529    PF14_0540    PF14_0632    PFC0135c    PFC0375c    PFD0525w    PFD0720w    PFD0825c    PFE0935c    PFE1195w    PFE1400c    PFF0655c    PFF0830w    PFF1030w    PFF1345w    PFI0200c   


G3DSA:1.25.40.10 - TPR-like_helical (Gene3D link)

Interpro entry IPR011990 : Tetratricopeptide-like helical (Interpro link)

Interpro description:

This domain consists of a multi-helical fold comprised of two curved layers of alpha helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids. This topology has been found with a number of repeats and domains, including the tetratricopeptide repeat (TPR) (found in kinesin light chains, SNAP regulatory proteins, clathrin heavy chains and bacterial aspartyl-phosphate phosphatases), and the pentatricopeptide repeat (PPR) (RNA-processing proteins). The TPR is likely to be an ancient repeat, since it is found in eukaryotes, bacteria and archaea, whereas the PPR repeat is found predominantly in higher plants. The superhelix formed from these repeats can bind ligands at a number of different regions, and has the ability to acquire multiple functional roles.

Proteins where this domain is known:
MAL13P1.139    MAL13P1.18    MAL13P1.274    MAL13P1.52    MAL8P1.60    PF07_0026    PF11_0101    PF11_0108    PF11_0124    PF11_0433    PF13_0107    PF13_0190    PF13_0231    PF14_0031    PF14_0098    PF14_0196    PF14_0324    PF14_0462    PFB0190c    PFB0610c    PFC0515c    PFC0550w    PFD0180c    PFE0085c    PFE0445c    PFE1370w    PFE1545c    PFF0080c    PFF0490w    PFF1505w    PFL0280c    PFL0615w    PFL0930w    PFL2015w    PFL2120w    PFL2275c   


G3DSA:1.25.40.120 - Prenyl_trans (Gene3D link)

Interpro entry IPR008940 : (Interpro link)

Interpro description:

Protein prenyltransferases catalyze the transfer of the carbon moiety of C15 farnesyl pyrophosphate or geranylgeranyl pyrophosphate to a conserved cysteine residue in a CaaX motif of protein and peptide substrates. The addition of a farnesyl group is required to anchor proteins to the cell membrane. In the 3D structure of a mammalian Ras farnesyltransferase (FTase), both subunits are largely composed of alpha-helices. The alpha-2 to alpha-15 helices in the alpha subunit fold into a novel helical hairpin structure, resulting in a crescent-shaped domain that envelops part of the beta subunit. The 12 helices of the beta-subunit form an alpha-alpha barrel. Six additional helices connect the inner core of helices and form the outside of the helical barrel. A deep cleft surrounded by hydrophobic amino acids in the centre of the barrel is proposed as the FPP-binding pocket. A single Zn2+ ion is located at the junction between the hydrophilic surface groove near the subunit interface.

Proteins where this domain is known:
PF14_0403    PFL2050w   


G3DSA:1.25.40.150 - ATPase_V1_H_C (Gene3D link)

Interpro entry IPR011987 : ATPase, V1 complex, subunit H, C-terminal (Interpro link)

Interpro description:

ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

This entry represents the C-terminal domain of subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.

More information about this protein can be found at Protein of the Month: ATP Synthases.

Proteins where this domain is known:
PF13_0034   


G3DSA:1.25.40.180 - MIF4-like_typ_1/2/3 (Gene3D link)

Interpro entry IPR016021 : MIF4-like, type 1/2/3 (Interpro link)

Interpro description:

This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Family members contain two or more structurally similar domains of this fold connected by unstructured linkers; this entry covers types 1, 2 and 3 MIF4G-like domains. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low.

The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80. The complex binds to 5'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans.

Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.

Nonsense-mediated mRNA decay (NMD) in eukaryotes involves UPF1, UPF2 and UPF3 to accelerate the decay rate of two unique classes of transcripts: (1) nonsense mRNAs that arise through errors in gene expression, and (2) naturally occurring transcripts that lack coding errors but have built-in features that target them for accelerated decay (error-free mRNAs). NMD can trigger decay during any round of translation and can target CBC-bound or eIF-4E-bound transcripts. UPF2 contains MIF4G domains, while UPF3 contains an RNP domain.

Proteins where this domain is known:
MAL13P1.352    MAL13P1.63    PF11_0086    PFI1265w    PFL1855w   


G3DSA:1.25.40.20 - ANK (Gene3D link)

Interpro entry IPR002110 : (Interpro link)

Interpro description:

The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins, mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.

The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.

Proteins where this domain is known:
MAL13P1.126    MAL13P1.71    MAL8P1.28    PF10_0102    PF10_0213    PF10_0328    PF11_0197    PF11_0439    PF14_0106    PF14_0222    PF14_0690    PFC0160w    PFE0400w    PFF1315w    PFF1365c    PFL2200w   


G3DSA:1.25.40.250 - Transl_init_fac_sub12_N_euk (Gene3D link)

Interpro entry IPR016020 : Translation initiation factor 3, subunit 12, N-terminal, eukaryotic (Interpro link)

Interpro description:

This entry represents the N-terminal domain found in several eukaryotic translation initiation factor 3 subunit 12 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multi-subunit complex that is required for binding of mRNA to 40S ribosomal subunits, stabilisation of ternary complex binding to 40S subunits, and dissociation of 40S and 60S subunits.

Proteins where this domain is known:
PFC0441c   


G3DSA:1.25.40.260 - TFIIS/CRSP70_N (Gene3D link)

Interpro entry IPR014765 : Transcription elongation factor, TFIIS/CRSP70, N-terminal (Interpro link)

Interpro description:

Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, allowing it to read through template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS (predominantly the metazoan and plant forms), and CRSP70. The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix. CRSP70 is an essential subunit of the CRSP complex, which is required for the activity of the enhancer-binding protein Sp1.

Proteins where this domain is known:
PFI0285w   


G3DSA:1.25.40.30 - Clathrin_H_link (Gene3D link)

Interpro entry IPR012331 : (Interpro link)

Interpro description:

Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.

This entry represents the alpha-helical zigzag linker region connecting the conserved N-terminal beta-propeller region to the C-terminal alpha-alpha-superhelical region in clathrin heavy chains.

More information about these proteins can be found at Protein of the Month: Clathrin.

Proteins where this domain is known:
PFL0930w   


G3DSA:1.25.40.60 - G3DSA:1.25.40.60 (Gene3D link)

Proteins where this domain is known:
PFF0665c   


G3DSA:1.25.40.70 - PI3Ka (Gene3D link)

Interpro entry IPR001263 : Phosphoinositide 3-kinase accessory region PIK (Interpro link)

Interpro description:

Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The role of the accessory domain of PI3-kinase is unclear. It may be involved in substrate presentation.

Proteins where this domain is known:
PFE0765w   


G3DSA:1.25.40.80 - G3DSA:1.25.40.80 (Gene3D link)

Proteins where this domain is known:
PFE0675c   


G3DSA:1.25.40.90 - ENTH_VHS (Gene3D link)

Interpro entry IPR008942 : (Interpro link)

Interpro description:

This entry represents domains with a multi-helical, alpha-alpha 2-layered structural fold as found in: the ENTH domain of Epsin; the VHS domain of Hrs, Tom1, and ADP-ribosylation factors; the RPR domain of PCF11 protein; and the N-terminal domain of phosphoinositide-binding clathrin adaptor.

The epsin NH2-terminal homology (ENTH) domain is a membrane interacting module composed of a superhelix of alpha-helices. It is present at the NH2-terminus of proteins that often contain consensus sequences for binding to clathrin coat components and their accessory factors, and therefore function as endocytic adaptors. ENTH domain containing proteins have additional roles in signalling and actin regulation and may have yet other actions in the nucleus. The ENTH domain is structurally similar to the VHS domain.

The ENTH domain is approximately 150 amino acids long. The ENTH domain forms a compact globular structure, composed of eight alpha-helices connected by loops of varying length. Three helical hairpins that are stacked consecutively with a right-handed twist determine the general topology of the domain. This stacking gives the ENTH domain a rectangular appearance when viewed face on. The most highly conserved amino acids fall roughly into two classes: internal residues that are involved in packing and therefore are necessary for structural integrity, and solvent accessible residues that may be involved in protein-protein interactions.

VHS domains are found at the N-termini of select proteins involved in intracellular membrane trafficking. The domain consists of eight helices arranged in a superhelix. The surface of the domain has two main features: a basic patch on one side due to several conserved positively charged residues on helix 3 and a negatively charged ridge on the opposite side, formed by residues on helix 2. Comparison of the two VHS domains and the ENTH domain reveals a conserved surface, composed of helices 2 and 4, that is utilised for protein-protein interactions. In addition, VHS domain-containing proteins are also often localized to membranes. It has therefore been suggested that the conserved positively charged surface of helix 3 in VHS and ENTH domains plays a role in membrane binding.

Proteins where this domain is known:
PFL2195w   


G3DSA:1.50.10.20 - G3DSA:1.50.10.20 (Gene3D link)

Proteins where this domain is known:
PF11_0483    PFF0120w    PFL0695c   


G3DSA:2.10.109.10 - Pept_S24_S26_C (Gene3D link)

Interpro entry IPR011056 : (Interpro link)

Interpro description:

Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
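
The link between triad order and clan given in the paragraph above can be sketched as a small lookup. This is an illustration only: the names `CLAN_BY_ORDER`, `triad_order` and `clan_for` are hypothetical, the table holds just the three clans named in the text, and the residue positions in the usage lines are the commonly cited numbering for chymotrypsin and subtilisin, supplied here as an assumption rather than taken from this entry.

```python
# Linear order of the catalytic triad for the three clans named in the text.
CLAN_BY_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}

def triad_order(positions):
    """positions maps a one-letter residue code (H, D, S) to its sequence position.

    Returns the residues as a string ordered from N- to C-terminus.
    """
    return "".join(sorted(positions, key=positions.get))

def clan_for(positions):
    """Return the clan whose triad order matches, or None if it is not listed."""
    return CLAN_BY_ORDER.get(triad_order(positions))

# Assumed residue numbering, for illustration only.
print(clan_for({"H": 57, "D": 102, "S": 195}))   # chymotrypsin-like order
print(clan_for({"D": 32, "H": 64, "S": 221}))    # subtilisin-like order
```

The text says only that the order commonly reflects clan relationships, so a match here is a hint and not a classification.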

In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

This entry represents the C-terminal domain of the Escherichia coli LexA protein and the C-terminal domain of the E. coli signal peptidase (SPase). They share the same structural topology, consisting of a complex fold made of several coiled beta-sheets, and containing an SH3-like beta-barrel. This entry is associated with serine peptidases belonging to the MEROPS peptidase families S24 (LexA family, clan SF), S26A (signal peptidase I) and S26B (signalase).

The S26 family includes E. coli signal peptidase, SPase, which is a membrane-bound endopeptidase, with two N-terminal transmembrane segments and a C-terminal catalytic region. SPase functions to release proteins that have been translocated into the inner membrane from the cell interior, by cleaving off their signal peptides.

The S24 family includes:

All of these proteins, with the possible exception of RulA, interact with RecA, which activates self-cleavage, either derepressing transcription in the case of CI and LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. UmuD'2 is the homodimeric component of DNA pol V, which is produced from UmuD by RecA-facilitated self-cleavage. The first 24 N-terminal residues of UmuD are removed; UmuD'2 is a DNA lesion bypass polymerase. MucA, like UmuD, is a plasmid-encoded DNA polymerase (pol RI) which is converted into the active lesion-bypass polymerase by a self-cleavage reaction involving RecA.

This group of proteins also contains proteins not recognised as peptidases as well as those classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.

Proteins where this domain is known:
PF13_0118   


G3DSA:2.10.230.10 - HSP_DnaJ_cys-rich (Gene3D link)

Interpro entry IPR001305 : Heat shock protein DnaJ, cysteine-rich region (Interpro link)

Interpro description:

Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.

Proteins where this domain is known:
PFD0462w   


G3DSA:2.10.25.10 - G3DSA:2.10.25.10 (Gene3D link)

Proteins where this domain is known:
PFF1120c    PFI1475w   


G3DSA:2.10.70.10 - Complement_control_module (Gene3D link)

Interpro entry IPR016060 : (Interpro link)

Interpro description:

This entry represents complement control protein (CCP) modules, which are also known as sushi domains or short consensus repeats (SCR). The CCP module is a disulphide-rich domain with an all-beta fold. These domains are found in a wide variety of complement and adhesion proteins, such as complement receptors Cr1 and Cr2, complement C1R and C1S proteases, and complement decay-accelerating factor CD55, as well as in mannan-binding lectin serine protease 2 (MASP-2), GABA-B receptor 1, beta2-glycoprotein, membrane cofactor protein CD46, and as the 15th and 16th modules of Factor H.

Proteins where this domain is known:
PFD0295c   


G3DSA:2.102.10.10 - Rieske_reg (Gene3D link)

Interpro entry IPR005806 : Rieske [2Fe-2S] region (Interpro link)

Interpro description:

Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential, which is linked to ATP synthesis. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), present in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.

The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The Rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.

Proteins where this domain is known:
PF07_0085    PF14_0373   


G3DSA:2.120.10.30 - 6-blade_b-propeller_TolB-like (Gene3D link)

Interpro entry IPR011042 : (Interpro link)

Interpro description:

This entry represents a six-bladed beta-propeller domain consisting of six 4-stranded beta-sheet motifs. This domain can be found in TolB proteins (C-terminal), in soluble quinoprotein glucose dehydrogenase, in calcium-dependent phosphotriesterases, in the low density lipoprotein (LDL) receptor YWTD domain, in nidogen, and in serine/threonine-protein kinase (PknD) NHL repeat domain.

TolB is a periplasmic protein from Escherichia coli that is part of the Tol-dependent translocation system involving group A and E colicins that is used to penetrate and kill cells. TolB has two domains, an alpha-helical N-terminal domain that shares structural similarity with the C-terminal domain of transfer RNA ligases, and a beta-propeller C-terminal domain that shares structural similarity with numerous members of the prolyl oligopeptidase family and, to a lesser extent, with class B metallo-beta-lactamases (although it does not necessarily occur at the C-terminal in these proteins). The C-terminal domain of TolB may mediate protein-protein interactions with colicins.

Proteins where this domain is known:
PFL1065c   


G3DSA:2.120.10.80 - Kelch-typ_b-propeller (Gene3D link)

Interpro entry IPR015915 : (Interpro link)

Interpro description:

Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.

The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.

This entry represents the 6-bladed Kelch beta-propeller, which consists of six 4-stranded beta-sheet motifs (or six Kelch repeats).

Proteins where this domain is known:
MAL7P1.137    PF10_0179    PF10_0219    PF11_0240    PF11_0267    PF11_0268    PF11_0326    PF13_0238    PF14_0630    PF14_0649    PFL0270c    PFL0530c    PFL0650c   


G3DSA:2.130.10.10 - WD40/YVTN_repeat-like (Gene3D link)

Interpro entry IPR015943 : (Interpro link)

Interpro description:

This entry represents a WD40/YVTN repeat-like domain. Both the WD40 and the YVTN repeated motifs consist of about 40 residues, and although they consist of distinct sequences, they do share a similar structure. Structurally, both the WD40 and the YVTN repeated motifs form seven-bladed propellers (although some members can contain eight blades), which consist of seven 4-stranded beta-sheets.

The WD40-type repeat domain is found in the beta-1 subunit of the signal-transducing G protein, in yeast Tup1 protein, in Groucho, in the yeast cell cycle protein Cdc4 and in actin-interacting protein 1.

The YVTN-type repeat domain is found in archaeal surface layer proteins (SLPs) that protect cells from extreme environments, in quinohemoprotein amine dehydrogenase (QHNDH), and in methylamine dehydrogenase.

Proteins where this domain is known:
MAL13P1.142    MAL13P1.148    MAL13P1.245    MAL13P1.264    MAL13P1.385    MAL13P1.54    MAL13P1.79    MAL7P1.81    MAL8P1.145    MAL8P1.43    PF07_0017    PF07_0092    PF07_0106    PF08_0019    PF08_0065    PF08_0130    PF08_0135    PF10_0045    PF10_0126    PF10_0128    PF10_0196    PF10_0261    PF10_0285    PF10_0326    PF11_0056    PF11_0171    PF11_0195    PF11_0222    PF11_0275    PF11_0400    PF11_0471    PF13_0149    PF13_0184    PF13_0250    PF13_0309    PF13_0335    PF14_0055    PF14_0062    PF14_0087    PF14_0101    PF14_0243    PF14_0263    PF14_0314    PF14_0412    PF14_0456    PF14_0565    PF14_0640    PFA0520c    PFB0640c    PFB0700c    PFC0100c    PFC0365w    PFC0965w    PFD0455w    PFE0090w    PFE0505w    PFE0540w    PFE0885w    PFE0930w    PFE1270c    PFE1310c    PFF0330w    PFF0395c    PFF1000w    PFF1480w    PFI0275w    PFI0290c    PFI1080w    PFL0470w    PFL0610w    PFL0920c    PFL0970w    PFL1040w    PFL1175w    PFL1290w    PFL1395c    PFL1470c    PFL1480w    PFL1820w    PFL1975c    PFL2105c    PFL2460w   


G3DSA:2.130.10.110 - Clathrin_H-chain_link/propller (Gene3D link)

Interpro entry IPR016025 : Clathrin, heavy chain, linker and propeller (Interpro link)

Interpro description:

Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.

This entry represents a region covering the N-terminal beta-propeller region of clathrin heavy chains that extends away from the hub of triskelia, and which is responsible for peptide binding, as well as the core motif for the alpha-helical zigzag linker region connecting the conserved N-terminal beta-propeller region to the C-terminal alpha-alpha-superhelical region in clathrin heavy chains.

More information about these proteins can be found at Protein of the Month: Clathrin.

Proteins where this domain is known:
PFL0930w   


G3DSA:2.130.10.30 - Reg_csome_cond/b-lactamase_inh (Gene3D link)

Interpro entry IPR009091 : (Interpro link)

Interpro description:

The beta-lactamase-inhibitor protein II (BLIP-II) is a secreted protein produced by the soil bacterium Streptomyces exfoliatus SMF19. BLIP-II acts as a potent inhibitor of beta-lactamases such as TEM-1, which is the most widespread resistance enzyme to penicillin antibiotics. BLIP-II binds competitively to TEM-1, but no direct contacts are made with TEM-1 active site residues. BLIP-II shows no sequence similarity with BLIP, even though both bind to and inhibit TEM-1. However, BLIP-II does share significant sequence identity with the regulator of chromosome condensation (RCC1) family of proteins. These two families are clearly related, both having a seven-bladed beta-propeller structure, although they differ in the number of strands per blade, BLIP-II having three antiparallel beta-strands per blade, while RCC1 has four-stranded blades. RCC1 is a eukaryotic nuclear protein that acts as a guanine nucleotide exchange factor for Ran, a member of the Ras GTPase family. RCC1 mediates a Ran-GTP gradient necessary for the regulation of spindle formation and nuclear assembly during mitosis, as well as for the transport of macromolecules across the nuclear membrane during interphase.

Proteins where this domain is known:
MAL7P1.38    PF11_0385    PF11_0448    PF13_0303    PFD0145c    PFD0900w    PFE0420c    PFI0975c    PFI1500w    PFL0975w   


G3DSA:2.160.10.10 - G3DSA:2.160.10.10 (Gene3D link)

Proteins where this domain is known:
MAL8P1.68    PF07_0098    PF11_0460    PF14_0774    PFC0860w    PFD0260c    PFE1325w    PFL0675c   


G3DSA:2.160.20.60 - Glu_synthase_C (Gene3D link)

Interpro entry IPR002489 : Glutamate synthase, alpha subunit, C-terminal (Interpro link)

Interpro description:

Glutamate synthase (GltS) is a complex iron-sulphur flavoprotein that catalyses the reductive synthesis of L-glutamate from 2-oxoglutarate and L-glutamine via intramolecular channelling of ammonia, a reaction in the bacterial, yeast and plant pathways for ammonia assimilation. GltS is a multifunctional enzyme that functions through three distinct active centres carrying out multiple reaction steps: L-glutamine hydrolysis, conversion of 2-oxoglutarate into L-glutamate, and electron uptake from an electron donor. The active centres are synchronised to avoid the wasteful consumption of L-glutamine. There are three classes of GltS, which share many functional properties: bacterial NADPH-dependent GltS, ferredoxin-dependent GltS from photosynthetic cells, and NAD(P)H-dependent GltS from yeast, fungi and lower animals.

The dimeric alpha subunits each consist of four domains: N-terminal amidotransferase domain, the central domain, the FMN binding domain and the C-terminal domain. The C-terminal domain forms a right-handed beta-helix that comprises seven helical turns. Each helical turn has a sharp bend that is associated with a repeated sequence motif consisting of G-XX-G-XXX-G. This domain does not contain any residues directly involved in catalysis, but has a crucial structural role.
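
The repeated motif G-XX-G-XXX-G can be searched for with a simple regular expression. The sketch below assumes that X stands for any residue; the helper name and the toy sequence are hypothetical and are not taken from a real glutamate synthase.

```python
import re

# G-XX-G-XXX-G read as: G, any two residues, G, any three residues, G.
# Wrapping the pattern in a lookahead also reports overlapping occurrences.
GLY_REPEAT = re.compile(r"(?=(G.{2}G.{3}G))")


def find_gly_repeats(sequence):
    """Return (1-based start, matched segment) for every motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in GLY_REPEAT.finditer(sequence)]


toy = "MKGAAGTTTGLLGSSGPPPGE"  # made-up sequence for illustration
print(find_gly_repeats(toy))
```

A match only shows that the spacing of glycines fits the motif; it says nothing about whether the segment actually forms a turn of the beta-helix.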

This domain is also found in proteins such as subunit C of formylmethanofuran dehydrogenase, which catalyses the first step in methane formation from carbon dioxide in methanogenic archaea. There are two isoenzymes of formylmethanofuran dehydrogenase: a tungsten-containing isoenzyme (FwdC) and a molybdenum-containing isoenzyme (FmdC). The tungsten isoenzyme is constitutively transcribed, whereas transcription of the molybdenum operon is induced by molybdate.

Proteins where this domain is known:
PF14_0334   


G3DSA:2.160.20.70 - CAP/MinC_C (Gene3D link)

Interpro entry IPR016098 : (Interpro link)

Interpro description:

Cyclase-associated proteins (CAPs) are highly conserved monomeric actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways. Only yeast CAPs are involved in adenylate cyclase activation. The C-terminal domain of CAP proteins is responsible for G-actin-binding that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes.

In Escherichia coli, three Min proteins (MinC, MinD and MinE) negatively regulate FtsZ assembly at the cell poles in order to ensure the Z-ring only assembles at cell midpoint. MinC inhibits formation of the Z-ring by preventing FtsZ assembly. MinD binds to MinC near the cell poles, sequestering MinC away from the cell midpoint so the Z-ring can form there. MinC is an oligomer, probably a dimer, that consists of two domains: the N-terminal domain is responsible for FtsZ inhibition, while the C-terminal domain is responsible for binding to MinD and to a component of the division septum.

This entry represents a structural domain found at the C-terminal of CAP proteins as well as MinC. This domain has a superhelical structure, where the superhelix turns are made of either two (CAP) or three (MinC) beta-strands each.

Proteins where this domain is known:
PFA0260c   


G3DSA:2.170.11.10 - TopoI_DNA-bd_mixed-a/b_euk (Gene3D link)

Interpro entry IPR013030 : DNA topoisomerase I, DNA binding, mixed alpha/beta motif, eukaryotic-type (Interpro link)

Interpro description:

DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

This entry represents a structural motif, consisting of a complex alpha/beta topology that forms the N-terminal DNA-binding domain of certain eukaryotic topoisomerase I (type IB) enzymes. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.

More information about this protein can be found at Protein of the Month: DNA Topoisomerases.

Proteins where this domain is known:
PFE0520c   


G3DSA:2.170.120.12 - RNAP_insert (Gene3D link)

Interpro entry IPR011262 : DNA-directed RNA polymerase, insert (Interpro link)

Interpro description:

DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:

Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha (RpoA) subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact.

The dimerisation domains differ between the different subunit families. In eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain is comprised of a central insert domain, which interrupts an Rpb11-like domain, dividing it into two halves. In eukaryotic Rpb11 and archaeal L subunits, the insert domain is lacking, leaving the Rpb11-like domain intact and contiguous.

Proteins where this domain is known:
PF11_0445    PF13_0040    PF14_0695    PFI1130c   


G3DSA:2.170.130.20 - LCCL (Gene3D link)

Interpro entry IPR004043 : (Interpro link)

Interpro description:

The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices.
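
As a rough illustration, the conserved motif YxxxSxxCxAAVHxGVI can be turned into a regular expression by treating each lowercase x as "any residue". The sketch below is written under that assumption; the function names and the toy sequence are hypothetical, and an exact-match search of this kind would miss any real domain that deviates from the consensus.

```python
import re

LCCL_MOTIF = "YxxxSxxCxAAVHxGVI"  # motif as quoted in the description above


def motif_to_regex(motif):
    """Treat each lowercase 'x' as a wildcard and compile the result."""
    return re.compile(motif.replace("x", "."))


def find_conserved_his(sequence, motif=LCCL_MOTIF):
    """Return 1-based positions of the motif's histidine in `sequence`."""
    his_offset = motif.index("H")  # offset of the conserved H within the motif
    pattern = motif_to_regex(motif)
    return [m.start() + his_offset + 1 for m in pattern.finditer(sequence)]


toy = "AAYQRTSLLCDAAVHKGVIPP"  # made-up sequence containing one motif copy
print(find_conserved_his(toy))
```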

Some proteins known to contain an LCCL domain include Limulus factor C, an LPS endotoxin-sensitive trypsin type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (cochlin is probably a secreted protein; mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain.

Proteins where this domain is known:
PF14_0067    PF14_0532    PF14_0723    PFA0445w    PFI0185w   


G3DSA:2.170.150.10 - Mss4/transl-control_tumor (Gene3D link)

Interpro entry IPR011323 : (Interpro link)

Interpro description:

This entry represents a structural domain with a complex fold consisting of several coiled beta-sheets. This domain exists as a duplication, consisting of a tandem repeat of two similar structural motifs. This entry represents copies of this structural motif in the following protein families:

Mss4 is a conserved accessory factor for Rab GTPases, which function as ubiquitous regulators of intracellular membrane trafficking. Mss4 acts to promote nucleotide release from exocytic but not endocytic Rab GTPases. Mss4 has a complex fold made of several coiled beta-sheets, and consists of a duplication of tandem repeats of two similar structural motifs. It contains a zinc-binding site.

Other proteins that show structural similarity to Mss4 include the translationally controlled tumour-associated proteins TCTPs, which contain an insertion of an alpha helical hairpin, and lack the zinc-binding site. TCTPs are a highly conserved and abundantly expressed family of eukaryotic proteins that are implicated in both cell growth and the human acute allergic response.

Proteins where this domain is known:
PFE0545c   


G3DSA:2.170.270.10 - G3DSA:2.170.270.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.122    PF08_0012    PF13_0293    PFD0190w    PFF1440w    PFL0690c   


G3DSA:2.20.110.10 - G3DSA:2.20.110.10 (Gene3D link)

Proteins where this domain is known:
PF10_0306    PF11_0307    PF14_0586    PFB0230c    PFE0560c   


G3DSA:2.20.25.10 - G3DSA:2.20.25.10 (Gene3D link)

Proteins where this domain is known:
PF07_0057    PFA0505c    PFA0525w    PFB0290c    PFD0360w   


G3DSA:2.20.25.20 - Casein_kin_II_reg-sub_b-sht (Gene3D link)

Interpro entry IPR016150 : Casein kinase II, regulatory subunit, beta-sheet (Interpro link)

Interpro description:

Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta subunit is a highly conserved protein of about 25 kD that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.

This entry represents the C-terminal beta-sheet domain.

Proteins where this domain is known:
PF11_0048    PF13_0232   


G3DSA:2.20.25.30 - Ribosomal_L37ae/L37e_core (Gene3D link)

Interpro entry IPR011331 : Ribosomal protein L37ae/L37e, core (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

This entry represents the core domain of ribosomal proteins L37ae and L37e, which share a common rubredoxin-like metal-binding fold containing two CX(n)C motifs (where n is usually two).
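
A CX(n)C motif is two cysteines separated by n residues, so candidate sites can be listed with a small parameterised search. The sketch below defaults to n = 2, as stated above; the function name, the n_min/n_max parameters and the toy sequence are assumptions made for illustration only.

```python
import re


def find_cxnc(sequence, n_min=2, n_max=2):
    """List (1-based start, n, matched text) for each C-X(n)-C occurrence.

    Every spacing n from n_min to n_max is tried; overlapping hits are
    reported because the pattern is wrapped in a lookahead.
    """
    hits = []
    for n in range(n_min, n_max + 1):
        pattern = re.compile(r"(?=(C.{%d}C))" % n)
        for m in pattern.finditer(sequence):
            hits.append((m.start() + 1, n, m.group(1)))
    return sorted(hits)


toy = "MACPKCGKTAVKRQAVGIWHCGSCMKT"  # made-up sequence with two CxxC sites
print(find_cxnc(toy))
```

Two hits with a suitable gap between them are consistent with the pair of CX(n)C motifs described for this fold, but they do not by themselves demonstrate metal binding.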

Proteins where this domain is known:
MAL7P1.320    PFB0455w   


G3DSA:2.20.25.50 - G3DSA:2.20.25.50 (Gene3D link)

Proteins where this domain is known:
PF13_0324   


G3DSA:2.20.28.30 - G3DSA:2.20.28.30 (Gene3D link)

Proteins where this domain is known:
MAL13P1.213   


G3DSA:2.20.70.10 - G3DSA:2.20.70.10 (Gene3D link)

Proteins where this domain is known:
PFL1745c   


G3DSA:2.30.130.10 - G3DSA:2.30.130.10 (Gene3D link)

Proteins where this domain is known:
PF14_0174    PFE1470w    PFI0365w   


G3DSA:2.30.140.10 - G3DSA:2.30.140.10 (Gene3D link)

Proteins where this domain is known:
PF11_0301   


G3DSA:2.30.170.20 - Ribosomal_L24E (Gene3D link)

Interpro entry IPR000988 : Ribosomal protein L24e (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian ribosomal protein L24; yeast ribosomal protein L30A/B (Rp29) (YL21); Kluyveromyces lactis ribosomal protein L30; Arabidopsis thaliana ribosomal protein L24 homolog; Haloarcula marismortui ribosomal protein HL21/HL22; and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1201. These proteins have 60 to 160 amino-acid residues.

Proteins where this domain is known:
PF13_0049    PFE0300c   
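
Entries in this list follow a regular layout: a Gene3D header line, an optional Interpro entry line, and a line of protein identifiers after "Proteins where this domain is known:". A minimal sketch of how such text could be read into records is shown below. The function name, the dictionary keys and the regular expressions are assumptions made for illustration, and free-text descriptions are simply skipped.

```python
import re

HEADER = re.compile(r"^(G3DSA:\S+) - (.+?) \(Gene3D link\)$")
INTERPRO = re.compile(r"^Interpro entry (IPR\d+) :")
PROTEINS_LABEL = "Proteins where this domain is known:"

def parse_entries(text):
    """Collect one record per Gene3D header found in the text."""
    entries = []
    current = None
    expect_proteins = False
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            # A new entry starts at each Gene3D header line.
            current = {"gene3d": header.group(1), "name": header.group(2),
                       "interpro": None, "proteins": []}
            entries.append(current)
            expect_proteins = False
        elif current is None:
            continue
        elif INTERPRO.match(line):
            current["interpro"] = INTERPRO.match(line).group(1)
        elif line == PROTEINS_LABEL:
            expect_proteins = True
        elif expect_proteins and line:
            # Identifiers are separated by runs of spaces.
            current["proteins"] = line.split()
            expect_proteins = False
    return entries

sample = """G3DSA:2.30.170.20 - Ribosomal_L24E (Gene3D link)

Interpro entry IPR000988 : Ribosomal protein L24e (Interpro link)

Proteins where this domain is known:
PF13_0049    PFE0300c
"""
entries = parse_entries(sample)
print(entries)
```

Records of this shape can be inverted to list, for each protein, the domains in which it appears.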


G3DSA:2.30.18.10 - TFIIA_betabarrel (Gene3D link)

Interpro entry IPR009088 : Transcription factor IIA, beta-barrel (Interpro link)

Interpro description:

Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP, and can dissociate HMGB1 already bound to TBP/TATA-box.

Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2.

This entry represents the beta-barrel domain found at the C-terminal of both TOA1 (or alpha/beta) and TOA2 (or gamma) subunits of TFIIA, and their homologues.

Proteins where this domain is known:
MAL7P1.78   


G3DSA:2.30.180.10 - BIgH3_FAS1 (Gene3D link)

Interpro entry IPR000782 : (Interpro link)

Interpro description:

The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria.

The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI-anchored protein.

FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.

Proteins known to contain a FAS1 domain include:

The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues. Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.

Proteins where this domain is known:
PF14_0446   


G3DSA:2.30.22.10 - GrpE_head (Gene3D link)

Interpro entry IPR009012 : GrpE nucleotide exchange factor, head (Interpro link)

Interpro description:

In prokaryotes, the nucleotide exchange factor GrpE and the chaperone DnaJ are required for nucleotide binding of the molecular chaperone DnaK. The DnaK reaction cycle involves rapid peptide binding and release, which is dependent upon nucleotide binding. DnaJ accelerates the hydrolysis of ATP by DnaK, which enables the ADP-bound DnaK to tightly bind peptide. GrpE catalyses the release of ADP from DnaK, which is required for peptide release. In eukaryotes, GrpE is essential for mitochondrial Hsp70 function; however, the cytosolic Hsp70 homologues are GrpE-independent.

GrpE binds as a homodimer to the ATPase domain of DnaK, and may interact with the peptide-binding domain of DnaK. GrpE accomplishes nucleotide exchange by opening the nucleotide-binding cleft of DnaK. GrpE is comprised of two domains, the N-terminal coiled coil domain, which may facilitate peptide release, and the C-terminal head domain, which forms part of the contact surface with the ATPase domain of DnaK. The head domain is comprised of six short beta strands with a limited hydrophobic core.

Proteins where this domain is known:
PF11_0258   


G3DSA:2.30.29.30 - PH_type (Gene3D link)

Interpro entry IPR011993 : (Interpro link)

Interpro description:

Pleckstrin homology (PH) domains are small modular domains that occur once, or occasionally several times, in a large variety of signalling proteins, where they serve as simple targeting domains that recognize only phosphoinositide headgroups. PH domains can target their host protein to the plasma and internal membranes through their association with phosphoinositides. PH domains have a partly opened beta-barrel topology that is capped by an alpha helix. Proteins containing PH domains include pleckstrin (N-terminal), phospholipase C delta-1, beta-spectrin, dynamin, son-of-sevenless, Grp1, Unc-89, Tapp1 and Rac-alpha kinase.

The structure of PH domains is similar to the phosphotyrosine-binding domain (PTB) found in IRS-1 (insulin receptor substrate 1), Shc adaptor and Numb; to the Ran-binding domain, found in Nup nuclear pore complex and Ranbp1; to the Enabled/VASP homology domain 1 (EVH1 domain), found in Enabled, VASP (vasodilator-stimulated phosphoprotein), Homer and WASP actin regulatory protein; to the third domain of FERM, found in moesin, radixin, ezrin, merlin and talin; and to the PH-like domain of neurobeachin.

Proteins where this domain is known:
MAL13P1.256    MAL13P1.306    PF10_0132    PF10_0189    PF11_0242    PF11_0327    PFB0257c    PFD0207c    PFD0950w   


G3DSA:2.30.29.40 - G3DSA:2.30.29.40 (Gene3D link)

Proteins where this domain is known:
PF11_0252   


G3DSA:2.30.30.100 - G3DSA:2.30.30.100 (Gene3D link)

Proteins where this domain is known:
MAL13P1.253    MAL8P1.48    MAL8P1.9    PF08_0049    PF11_0255    PF11_0266    PF11_0280    PF11_0524    PF13_0142    PF14_0146    PF14_0411    PFB0865w    PFE1020w    PFI0475w    PFL0460w   


G3DSA:2.30.30.190 - G3DSA:2.30.30.190 (Gene3D link)

Proteins where this domain is known:
PFI0335w   


G3DSA:2.30.30.200 - Ribosomal_L24_SH3-like (Gene3D link)

Interpro entry IPR014723 : (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

Several prokaryotic and eukaryotic proteins that are involved in the translation process contain an SH3-like domain, consisting of a partly opened beta barrel, where the last strand is interrupted by a 3-10 helical turn. This entry represents the SH3-like beta barrel domain found in the ribosomal proteins L24 and L26, the structure of which has been determined for L24 from the archaeon Haloarcula marismortui. The 50S subunit proteins function primarily to stabilize inter-domain interactions that are necessary to maintain the subunit's structural integrity, displaying a wide variety of protein-RNA interactions. Interactions between RNA and the SH3 domains appear to be mediated by the loops connecting the beta-strands and not the beta-barrel itself. L24 uses these loops between beta-strands to contact H19 and H24.

Proteins where this domain is known:
PFC0535w    PFF0245w   


G3DSA:2.30.30.210 - RNase_P_Rpp29 (Gene3D link)

Interpro entry IPR002730 : Ribonuclease P/MRP, p29 subunit, eukaryotic/archaeal (Interpro link)

Interpro description:

This entry represents the p29 subunit (also known as Rpp29 or Pop4) of the related ribonucleoproteins ribonuclease (RNase) P and RNase MRP, which can be found in both eukaryotes and archaea. The structure of the RNase P subunit, Rpp29, from Methanobacterium thermoautotrophicum has been determined. Mth Rpp29 is a member of the oligonucleotide/oligosaccharide binding fold family. It contains a structured beta-barrel core and unstructured N- and C-terminal extensions bearing several highly conserved amino acid residues that could be involved in RNA contacts in the protein-RNA complex. Rpp29 catalyses the endonucleolytic cleavage of RNA, removing extra 5' nucleotides from the tRNA precursor. It interacts with the Rpp25 and Pop5 subunits.

RNase P is a ubiquitous ribonucleoprotein enzyme primarily responsible for cleaving the 5' leader sequence during maturation of tRNAs in all three domains of life. In eubacteria, this enzyme is made up of two subunits: a large RNA (approximately 120 kDa) responsible for mediating catalysis, and a small protein cofactor (approximately 15 kDa) that modulates substrate recognition and is required for efficient in vivo catalysis. In contrast, multiple proteins are associated with eukaryotic and archaeal RNase P, and these proteins exhibit no recognizable homology to the conserved bacterial protein subunit. In reconstitution experiments with recombinantly expressed and purified protein subunits, Mth Rpp29, a homologue of the Rpp29 protein subunit from eukaryotic RNase P, is an essential protein component of the archaeal holoenzyme. In Saccharomyces cerevisiae (Baker's yeast), RNase P consists of 9 protein subunits (Pop1, Pop3-8, Rpr2 and Rpp1), while in humans there are 10 subunits (Rpp14, 20, 21, 25, 29, 30, 38, 40, hPop1, 5).

RNase MRP (mitochondrial RNA processing) is an rRNA processing enzyme that cleaves a specific site within precursor rRNA to generate the mature 5'-end of 5.8S rRNA. RNase MRP also cleaves primers for mitochondrial DNA replication and CLB2 mRNA. In yeast, RNase MRP possesses one putatively catalytic RNA and at least 9 protein subunits and is highly related to RNase P (Pop1, Pop3-Pop8, Rpp1, Snm1 and Rmp1).
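
The overlap between the two yeast enzymes can be made explicit by comparing the subunit lists exactly as they are written in the description above. The sketch below is only a bookkeeping aid; the helper name `expand` and the variable names are assumptions made for this example.

```python
def expand(prefix, first, last):
    """Expand a range such as Pop3-8 into {'Pop3', ..., 'Pop8'}."""
    return {f"{prefix}{i}" for i in range(first, last + 1)}

# Yeast subunit lists as given in the description text.
rnase_p = {"Pop1", "Rpr2", "Rpp1"} | expand("Pop", 3, 8)
rnase_mrp = {"Pop1", "Rpp1", "Snm1", "Rmp1"} | expand("Pop", 3, 8)

shared = rnase_p & rnase_mrp      # subunits named for both enzymes
p_only = rnase_p - rnase_mrp      # named only for RNase P
mrp_only = rnase_mrp - rnase_p    # named only for RNase MRP

print(len(rnase_p), len(rnase_mrp), sorted(shared))
print(sorted(p_only), sorted(mrp_only))
```

On these lists, eight subunits are shared, Rpr2 is named only for RNase P, and Snm1 and Rmp1 only for RNase MRP. The MRP list contains ten names, which is consistent with the "at least 9" stated in the text.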

Proteins where this domain is known:
PFF1355w   


G3DSA:2.30.30.30 - Ribosomal_L2 (Gene3D link)

Interpro entry IPR014722 : (Interpro link)

Interpro description:

The fundamental activity of the ribosome is two-fold: to decode the message of the mRNA in the small subunit, and to form a peptide bond between peptidyl-tRNA and aminoacyl-tRNA by a peptidyl transferase activity in the large subunit. Several prokaryotic and eukaryotic proteins that are involved in the translation process contain an SH3-like domain. The structure of the translation protein SH3-like domain is a partly opened beta barrel, where the last strand is interrupted by a 3-10 helical turn. The structure of the RNA-binding C-terminal domain of the Bacillus stearothermophilus (Geobacillus stearothermophilus) ribosomal protein L2 has been shown to adopt the SH3-like barrel topology. The L2 protein is located near the peptidyl transferase centre in the large ribosomal subunit where it may contribute to peptidyl transferase activity, and is involved in the assembly of the 23S rRNA. Likewise, the N-terminal domain of the ubiquitous eukaryotic translation initiation factor 5A (IF-5A) protein adopts the SH3-like barrel topology. IF-5A is involved in the initial step of peptide bond formation in translation and in cell-cycle regulation. IF-5A acts as a cofactor of the Rev protein in HIV-1-infected cells and of the Rex protein in T-cell leukaemia virus 1-infected cells.

This entry represents a subset of those identified in.

Proteins where this domain is known:
PF11_0337    PFL0210c   


G3DSA:2.30.30.70 - G3DSA:2.30.30.70 (Gene3D link)

Proteins where this domain is known:
PF14_0240   


G3DSA:2.30.33.40 - Chaprnin_Cpn10 (Gene3D link)

Interpro entry IPR001476 : Chaperonin Cpn10 (Interpro link)

Interpro description:

The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).

The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of six to eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
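
The stoichiometry described above can be turned into a quick back-of-the-envelope calculation. The sketch below uses the nominal 60 kDa and 10 kDa subunit sizes from the protein names; real subunit masses differ somewhat from these nominal values, and the function name is an assumption made for this example.

```python
def nominal_mass_kda(rings, subunits_per_ring, subunit_kda):
    """Nominal oligomer mass from ring count, ring size and subunit size."""
    return rings * subunits_per_ring * subunit_kda

SUBUNITS_PER_RING = 7  # heptameric rings, as stated in the description

# GroEL (cpn60): two stacked heptameric rings, a tetradecamer.
groel_subunits = 2 * SUBUNITS_PER_RING
groel_mass = nominal_mass_kda(2, SUBUNITS_PER_RING, 60)

# GroES (cpn10): a single dome-like heptamer.
groes_mass = nominal_mass_kda(1, SUBUNITS_PER_RING, 10)

print(groel_subunits, groel_mass, groes_mass)
```

With these nominal figures the GroEL tetradecamer comes to about 840 kDa and the GroES heptamer to about 70 kDa.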

Proteins where this domain is known:
PF13_0180    PFL0740c   


G3DSA:2.30.38.10 - G3DSA:2.30.38.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.485    PF07_0129    PF14_0751    PF14_0761    PFB0695c    PFC0050c    PFD0085c    PFF0945c    PFF1350c    PFL0035c    PFL2570w   


G3DSA:2.30.42.10 - G3DSA:2.30.42.10 (Gene3D link)

Proteins where this domain is known:
MAL8P1.98    PFC0330w   


G3DSA:2.40.10.10 - no description (Gene3D link)

Proteins where this domain is known:
MAL8P1.126    MAL8P1.98   


G3DSA:2.40.10.170 - G3DSA:2.40.10.170 (Gene3D link)

Proteins where this domain is known:
PFL1725w   


G3DSA:2.40.100.10 - PPIase_cyclophilin (Gene3D link)

Interpro entry IPR002130 : Peptidyl-prolyl cis-trans isomerase, cyclophilin-type (Interpro link)

Interpro description:

Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.

Note: FKBPs, a family of proteins that bind the immunosuppressive drug FK506, are also PPIases, but their sequence is not at all related to that of cyclophilin.

Proteins where this domain is known:
PF08_0121    PF08_0128    PF11_0164    PF11_0170    PF14_0223    PFC0975c    PFE0505w    PFE1430c    PFI1490c    PFL0120c    PFL0735w


G3DSA:2.40.150.20 - Ribosomal_L14 (Gene3D link)

Interpro entry IPR000218 : Ribosomal protein L14b/L23e (Interpro link)

Interpro description:

Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.

Proteins where this domain is known:
PF13_0171    PFE0960w


G3DSA:2.40.20.10 - Kringle (Gene3D link)

Interpro entry IPR000001 : (Interpro link)

Interpro description:

Kringles are autonomous structural domains, found throughout the blood clotting and fibrinolytic proteins. Kringle domains are believed to play a role in binding mediators (e.g., membranes, other proteins or phospholipids), and in the regulation of proteolytic activity. Kringle domains are characterised by a triple loop, 3-disulphide bridge structure, whose conformation is defined by a number of hydrogen bonds and small pieces of anti-parallel beta-sheet. They are found in a varying number of copies in some plasma proteins including prothrombin and urokinase-type plasminogen activator, which are serine proteases belonging to MEROPS peptidase family S1A.

Proteins where this domain is known:
PFI0550w


G3DSA:2.40.240.10 - G3DSA:2.40.240.10 (Gene3D link)

Proteins where this domain is known:
PF13_0170    PF13_0257


G3DSA:2.40.270.10 - G3DSA:2.40.270.10 (Gene3D link)

Proteins where this domain is known:
PF11_0358    PFB0715w    PFL0330c


G3DSA:2.40.30.10 - G3DSA:2.40.30.10 (Gene3D link)

Proteins where this domain is known:
MAL13P1.164    MAL13P1.243    PF07_0062    PF10_0272    PF11_0245    PF13_0304    PF13_0305    PF13_0353    PF14_0104    PF14_0486    PFA0495c    PFE0830c    PFF0115c    PFF0345w    PFF1115w    PFL1590c    PFL1710c


G3DSA:2.40.30.20 - G3DSA:2.40.30.20 (Gene3D link)

Proteins where this domain is known:
PFB0795w


G3DSA:2.40.30.30 - Riboflavin_kinase (Gene3D link)

Interpro entry IPR015865 : Riboflavin kinase (Interpro link)

Interpro description:

Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases.

This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.

Proteins where this domain is known:
MAL13P1.292


G3DSA:2.40.37.10 - G3DSA:2.40.37.10 (Gene3D link)

Proteins where this domain is known:
PF10_0322


G3DSA:2.40.40.20 - Asp_decarboxylase-like_fold (Gene3D link)

Interpro entry IPR009010 : Aspartate decarboxylase-like fold (Interpro link)

Interpro description:

Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair.

In many cases, including aspartate decarboxylase and aspartic proteinases, strands 1 and 4 are each bent and consist of two sections. The two sections normally make a right angle; sometimes their hydrogen-bond patterns are disrupted at the corner by a bulge or even by a large insertion. In these cases, the barrel can also be viewed as a pair of orthogonally packed sheets, each with four strands.
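
To give a feel for how n and S fix the geometry of a barrel, the sketch below applies the commonly used idealised relations for a regular barrel, usually attributed to Murzin, Lesk and Chothia: tan(alpha) = S*a / (n*b) and R = b / (2 sin(pi/n) cos(alpha)), where alpha is the tilt of the strands to the barrel axis and R the barrel radius. None of this comes from the Interpro entry itself: the spacings a = 3.3 angstroms (along a strand) and b = 4.4 angstroms (between strands) are typical literature values, and S = 10 for a six-stranded double-psi barrel is an assumption used only for illustration.

```python
import math

def barrel_geometry(n, S, a=3.3, b=4.4):
    """Idealised strand tilt (degrees) and radius (angstroms) of a beta-barrel.

    n: number of strands; S: shear number.
    a: assumed residue spacing along a strand; b: assumed inter-strand spacing.
    """
    alpha = math.atan((S * a) / (n * b))
    radius = b / (2.0 * math.sin(math.pi / n) * math.cos(alpha))
    return math.degrees(alpha), radius

# Assumed parameters for a six-stranded barrel with shear number 10.
tilt_deg, radius = barrel_geometry(n=6, S=10)
print(f"tilt = {tilt_deg:.1f} deg, radius = {radius:.1f} A")
```

Real barrels, especially those with bent strands such as the ones described above, deviate from this regular model, so the numbers are best read as rough estimates.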

Proteins where this domain is known:
PF07_0047    PFF0940c


G3DSA:2.40.40.30 - RNA_pol_A (Gene3D link)

Interpro entry IPR000722 : RNA polymerase, alpha subunit (Interpro link)

Interpro description:

RNA polymerases catalyse the DNA-dependent polymerisation of RNA, using the four ribonucleoside triphosphates as substrates. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Eukaryotic RNA polymerase I is essentially used to transcribe ribosomal RNA units, polymerase II is used for mRNA precursors, and III is used to transcribe 5S and tRNA genes. Each class of RNA polymerase is assembled from nine to fourteen different polypeptides. Members of the family include the largest subunit from eukaryotes; the gamma subunit from Cyanobacteria; the beta' subunit from bacteria; the A' subunit from archaea; and the B'' subunit from chloroplast RNA polymerases.

Proteins where this domain is known:
PF13_0150    PFC0805w    PFE0465c


G3DSA:2.40.50.100 - G3DSA:2.40.50.100 (Gene3D link)

Proteins where this domain is known:
PF10_0407    PF11_0339    PF13_0121    PF14_0664    PFC0170c


G3DSA:2.40.50.140 - OB_NA_bd_sub (Gene3D link)

Interpro entry IPR012340 : (Interpro link)

Interpro description:

A five-stranded beta-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1), and has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold. The fold is a five-stranded beta-sheet coiled to form a closed beta-barrel, capped by an alpha helix located between the third and fourth strands. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case.

There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RPB8.

Proteins where this domain is known:
MAL13P1.22    MAL13P1.327    MAL8P1.101    PF07_0023    PF07_0117    PF10_0269    PF10_0294    PF11_0130    PF11_0332    PF11_0337    PF11_0447    PF13_0095    PF13_0262    PF13_0291    PF14_0166    PF14_0177    PF14_0401    PF14_0585    PF14_0658    PFA0145c    PFA0470c    PFB0525w    PFC0290w    PFC0775w    PFD0470c    PFD0600c    PFD0790c    PFE0435c    PFE0845c    PFE1345c    PFI0235w    PFL0210c    PFL0580w    PFL0665c


G3DSA:2.40.50.40 - no description (Gene3D link)

Proteins where this domain is known:
PF11_0418    PFL1005c


G3DSA:2.40.50.90 - SNase (Gene3D link)

Interpro entry IPR006021 : Staphylococcal nuclease (SNase-like) (Interpro link)

Interpro description:

Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.

Proteins where this domain is known:
PF11_0374


    G3DSA:2.40.70.10 - Pept_Aspartc_cat (Gene3D link)

    Interpro entry IPR009007 : Peptidase aspartic, catalytic (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    These aspartate proteases all contain a common closed beta-barrel structure; they include pepsin, cathepsin, chymosin, beta-secretase, plasmepsin, plant acid proteases and retroviral proteases.

    Proteins where this domain is known:
    PF08_0108    PF10_0329    PF13_0133    PF14_0075    PF14_0076    PF14_0077    PF14_0078    PF14_0281    PF14_0625    PFC0495w    PFL1660c   


    G3DSA:2.60.11.10 - COX5B (Gene3D link)

    Interpro entry IPR002124 : Cytochrome c oxidase, subunit Vb (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion. Two of these cysteines are clustered in the C-terminal section of the subunit.

    Proteins where this domain is known:
    PFI1365w   


    G3DSA:2.60.120.10 - RmlC-like_jellyroll (Gene3D link)

    Interpro entry IPR014710 : (Interpro link)

    Interpro description:

    This entry represents domains with a double-stranded beta-helix jelly roll fold such as that found in RmlC (deoxythymidine diphosphate-4-dehydrorhamnose 3,5-epimerase), a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria.

    Other protein families contain domains that share this jelly roll fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S.

    The cAMP-binding domains found in the cAMP receptor protein (CRP) family display a similar double-stranded beta-helix jelly roll fold. These proteins include CooA, a CO-sensing haem protein that functions as a transcription activator, and the CnbD (cyclic nucleotide binding domain) of the HCN cation channel in which cAMP binding modulates gating of the channel.

    Proteins where this domain is known:
    MAL8P1.156    PF14_0172    PF14_0173    PF14_0346    PFL1110c   


    G3DSA:2.60.120.200 - ConA_like_subgrp (Gene3D link)

    Interpro entry IPR013320 : (Interpro link)

    Interpro description:

    Lectins and glucanases exhibit the common property of reversibly binding to specific complex carbohydrates. The lectins/glucanases are a diverse group of proteins found in a wide range of species from prokaryotes to humans. The different family members all contain a concanavalin A-like domain, which consists of a sandwich of 12-14 beta strands in two sheets with a complex topology. Members of this family are diverse, and include the lectins: legume lectins, cereal lectins, viral lectins, and animal lectins. Plant lectins function in the storage and transport of carbohydrates in seeds, the binding of nitrogen-fixing bacteria to root hairs, the inhibition of fungal growth or insect feeding, and in hormonally regulated plant growth. Protein members include concanavalin A (Con A), favin, isolectin I, lectin IV, soybean agglutinin and lentil lectin. Animal lectins include the galectins, which are S-type lactose-binding and IgE-binding proteins such as S-lectin, CLC protein, galectin1, galectin2, galectin3 CRD, and Congerin I.

    Other members with a Con A-like domain include the glucanases. Bacterial and fungal beta-glucanases, such as Bacillus 1-3,1-4-beta-glucanase, carry out the acid catalysis of beta-glucans found in microorganisms and plants. Similarly, kappa-carrageenase degrades kappa-carrageenans from marine red algae cell walls.

    This entry omits the xylanases and glycosyl hydrolases.

    Proteins where this domain is known:
    PFA0195w   


    G3DSA:2.60.120.260 - no description (Gene3D link)

    Proteins where this domain is known:
    PF07_0120    PF14_0384    PF14_0532    PF14_0723    PFL0850w   


    G3DSA:2.60.120.320 - G3DSA:2.60.120.320 (Gene3D link)

    Proteins where this domain is known:
    PFI1195c   


    G3DSA:2.60.120.470 - G3DSA:2.60.120.470 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.159   


    G3DSA:2.60.15.10 - ATPase_F1_d/e (Gene3D link)

    Interpro entry IPR001469 : ATPase, F1 complex, delta/epsilon subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF11_0485   


    G3DSA:2.60.200.20 - FHA (Gene3D link)

    Interpro entry IPR000253 : (Interpro link)

    Interpro description:

    The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.

    To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.

    Proteins where this domain is known:
    MAL13P1.405    PF11_0347    PF13_0042    PFI0470w    PFL0275w   


    G3DSA:2.60.260.20 - G3DSA:2.60.260.20 (Gene3D link)

    Proteins where this domain is known:
    PF14_0359    PFA0660w    PFB0090c    PFB0595w    PFD0462w    PFE0055c    PFF1415c   


    G3DSA:2.60.300.12 - HesB_yadR_yfhF (Gene3D link)

    Interpro entry IPR000361 : (Interpro link)

    Interpro description:

    The proteins in this entry are variously annotated as iron-sulphur cluster insertion protein or Fe/S biogenesis protein. They appear to be involved in Fe-S cluster biogenesis. This family includes IscA, HesB, YadR and YfhF-like proteins. The hesB gene is expressed only under nitrogen fixation conditions. IscA, an 11 kDa member of the hesB family of proteins, binds iron and [2Fe-2S] clusters, and participates in the biosynthesis of iron-sulphur proteins. IscA is able to bind at least 2 iron ions per dimer. Other members of this family include various hypothetical proteins that also contain the NifU-like domain, suggesting that they too are able to bind iron and are involved in Fe-S cluster biogenesis. Members of the HesB family are found in species as divergent as Homo sapiens (Human) and Haemophilus influenzae, suggesting that these proteins are involved in basic cellular functions.

    Proteins where this domain is known:
    PFB0320c    PFC1005c    PFE1135w   


    G3DSA:2.60.34.10 - G3DSA:2.60.34.10 (Gene3D link)

    Proteins where this domain is known:
    MAL7P1.228    PF08_0054    PF11_0351    PFI0875w   


    G3DSA:2.60.370.10 - CtaG_Cox11 (Gene3D link)

    Interpro entry IPR007533 : Cytochrome c oxidase assembly protein CtaG/Cox11 (Interpro link)

    Interpro description:

    Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C-terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilising the copper site.

    Proteins where this domain is known:
    PF14_0721   


    G3DSA:2.60.40.10 - Ig-like_fold (Gene3D link)

    Interpro entry IPR013783 : (Interpro link)

    Interpro description:

    This entry represents domains with an immunoglobulin-like (Ig-like) fold, which consists of a beta-sandwich of seven or more strands in two sheets with a Greek-key topology. Ig-like domains are one of the most common protein modules found in animals, occurring in a variety of different proteins. These domains are often involved in interactions, commonly with other Ig-like domains via their beta-sheets. Domains within this fold-family share the same structure, but can diverge with respect to their sequence. Based on sequence, Ig-like domains can be classified as V-set domains (antibody variable domain-like), C1-set domains (antibody constant domain-like), C2-set domains, and I-set domains (antibody intermediate domain-like). Proteins can contain more than one of these types of Ig-like domains. For example, in the human T-cell receptor antigen CD2, domain 1 (D1) is a V-set domain, while domain 2 (D2) is a C2-set domain, both domains having the same Ig-like fold.

    Domains with an Ig-like fold can be found in many, diverse proteins in addition to immunoglobulin molecules. For example, Ig-like domains occur in several different types of receptors (such as various T-cell antigen receptors), several cell adhesion molecules, MHC class I and II antigens, as well as the hemolymph protein hemolin, and the muscle proteins titin, telokin and twitchin.

    Proteins where this domain is known:
    PFF0685c   


    G3DSA:2.60.40.1030 - Clathrin_a-adaptin_app_Ig-like (Gene3D link)

    Interpro entry IPR013038 : Clathrin adaptor, alpha-adaptin, appendage, Ig-like subdomain (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-adaptin from AP2 clathrin adaptor complexes. This subdomain has an immunoglobulin-like beta-sandwich fold containing 7 strands in 2 beta-sheets in a Greek key topology. Alpha-adaptin has a hinge region and an ear domain. The appendage domain can bind directly to clathrin and accessory proteins forming an interconnected network, and can regulate the translocation of several endocytic accessory proteins to the bud site. The N-terminal domain of the alpha subunit binds to PtdIns(4,5)P2 and has been implicated in the recruitment of AP2 to the plasma membrane.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PFF0830w    PFL2220w   


    G3DSA:2.60.40.1150 - Clathrin_b-adaptin_app_Ig-like (Gene3D link)

    Interpro entry IPR013037 : Clathrin adaptor, beta-adaptin, appendage, Ig-like subdomain (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of gamma1-adaptin from AP1 clathrin adaptor complex, and the homologous C-terminal GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 8 strands in 2 beta-sheets in a Greek key topology. This is a similar fold to that found in alpha- and beta-adaptins, but there is little sequence identity between them. The GAE domain is involved in the recruitment of accessory proteins, such as gamma-synergin, Rabaptin-5, Eps15 and cyclin G-associated kinase, which modulate the functions of GAE domain-containing proteins in membrane trafficking events. The binding site in GAE for accessory proteins is located in a shallow hydrophobic trough surrounded by charged (mainly basic) residues.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PFE1400c   


    G3DSA:2.60.40.1170 - G3DSA:2.60.40.1170 (Gene3D link)

    Proteins where this domain is known:
    PF11_0202    PF13_0062    PF14_0386    PFL0885w   


    G3DSA:2.60.40.1230 - Clathrin_g-adaptin_app (Gene3D link)

    Interpro entry IPR008153 : Clathrin adaptor, gamma-adaptin, appendage (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of gamma1-adaptin from AP1 clathrin adaptor complex, and the homologous C-terminal GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 8 strands in 2 beta-sheets in a Greek key topology. This is a similar fold to that found in alpha- and beta-adaptins, but there is little sequence identity between them. The GAE domain is involved in the recruitment of accessory proteins, such as gamma-synergin, Rabaptin-5, Eps15 and cyclin G-associated kinase, which modulate the functions of GAE domain-containing proteins in membrane trafficking events. The binding site in GAE for accessory proteins is located in a shallow hydrophobic trough surrounded by charged (mainly basic) residues.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF14_0529   


    G3DSA:2.60.40.1420 - G3DSA:2.60.40.1420 (Gene3D link)

    Proteins where this domain is known:
    PFD0250c   


    G3DSA:2.60.40.1480 - Coatomer_gsu_app_Ig-like-sub (Gene3D link)

    Interpro entry IPR013040 : Coatomer, gamma subunit, appendage, Ig-like subdomain (Interpro link)

    Interpro description:

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Activated small guanine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.

    This entry represents a beta-sandwich structural motif found in the appendage domain of the gamma subunit of coatomer complexes. This subdomain has an immunoglobulin-like beta-sandwich fold containing 7 strands in 2 beta-sheets in a Greek key topology. The appendage domain of the gamma coatomer subunit has a similar overall fold to the appendage domain of clathrin adaptors, and can also share the same motif-based cargo recognition and accessory factor recruitment mechanisms.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0463   


    G3DSA:2.60.40.1490 - Anti-silence (Gene3D link)

    Interpro entry IPR006818 : Histone chaperone, ASF1-like (Interpro link)

    Interpro description:

    This family includes the yeast and human ASF1 proteins. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.

    Proteins where this domain is known:
    PFL1180w   


    G3DSA:2.60.40.150 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.134    PF11_0107    PF14_0530   


    G3DSA:2.60.40.360 - MSP (Gene3D link)

    Interpro entry IPR008962 : (Interpro link)

    Interpro description:

    The PapD-like superfamily of periplasmic chaperones directs the assembly of over 30 diverse adhesive surface organelles that mediate the attachment of many different pathogenic bacteria to host tissues, a critical early step in the development of disease. PapD, the prototypical chaperone, is necessary for the assembly of P pili. P pili contain the adhesin PapG, which mediates the attachment of uropathogenic Escherichia coli to Gal(alpha) Gal receptors present on kidney cells and are critical for the initiation of pyelonephritis. The PapD-like chaperones consist of two Ig-like domains oriented toward each other, forming L-shaped molecules. In the chaperone-subunit complex, the G1beta strand of the chaperone completes an atypical Ig fold in the subunit by occupying the groove and running parallel to the subunit C-terminal F strand. This donor strand complementation interaction simultaneously stabilizes pilus subunits and caps their interactive surfaces, preventing their premature oligomerisation in the periplasm. During pilus biogenesis, the highly conserved N-terminal extension of one subunit has been proposed to displace the chaperone G1beta strand from its neighbouring subunit in a mechanism termed donor strand exchange.

    This entry represents the immunoglobulin (Ig)-like beta-sandwich domain found in PapD, as well as in other periplasmic chaperone proteins that include FimC and SfaE from E. coli, and Caf1m from Yersinia pestis. In addition, major sperm proteins (MSP) and other related sperm proteins (such as WR4 and SSP-19) contain an Ig-like domain with a similar structural fold to PapD. Major sperm proteins are central components in molecular interactions underlying sperm motility, with many isoforms existing in Caenorhabditis elegans.

    Proteins where this domain is known:
    PF14_0377   


    G3DSA:2.60.40.420 - Cupredoxin (Gene3D link)

    Interpro entry IPR008972 : (Interpro link)

    Interpro description:

    Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as Cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.

    This entry represents cupredoxin proteins, as well as structural homologues to cupredoxin. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Proteins with a cupredoxin-type fold are found in the following family groups:

    Proteins where this domain is known:
    PF14_0288   


    G3DSA:2.60.40.790 - G3DSA:2.60.40.790 (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.78    MAL8P1.96    PF13_0021    PF13_0204    PF14_0510    PFC0581w    PFI0990c    PFI1325w    PFL0550w    PFL1765c    PFL1845c   


    G3DSA:2.60.60.20 - Lipase_LipOase (Gene3D link)

    Interpro entry IPR001024 : (Interpro link)

    Interpro description:

    Lipoxygenases are a class of iron-containing dioxygenases that catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants, where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data is available for the following lipoxygenases:

    The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences, five of which are clustered in a stretch of 40 amino acids. This region contains two of the three histidine iron ligands; the other histidines have been shown to be important for the activity of lipoxygenases.

    This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.

    Proteins where this domain is known:
    PF14_0067   


    G3DSA:2.70.150.10 - G3DSA:2.70.150.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.246    PF07_0115    PFE0195w    PFE0805w   


    G3DSA:2.70.160.11 - G3DSA:2.70.160.11 (Gene3D link)

    Proteins where this domain is known:
    PF08_0092    PF14_0242   


    G3DSA:2.70.210.12 - GTP1_OBG_sub (Gene3D link)

    Interpro entry IPR006169 : GTP1/OBG subdomain (Interpro link)

    Interpro description:

    Several proteins have recently been shown to contain the 5 structural motifs characteristic of GTP-binding proteins. These include murine DRG protein; GTP1 protein from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; and several others. Although the proteins contain GTP-binding motifs and are similar to each other, they do not share sequence similarity with other GTP-binding proteins, and have thus been classed as a novel group, the GTP1/OBG family. As yet, the function of these proteins is uncertain, but they have been shown to be important in development and normal cell metabolism.

    Proteins where this domain is known:
    MAL8P1.33    PF14_0114    PFF0385c   


    G3DSA:2.70.40.10 - G3DSA:2.70.40.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0282   


    G3DSA:2.80.10.50 - G3DSA:2.80.10.50 (Gene3D link)

    Proteins where this domain is known:
    PF10_0104   


    G3DSA:3.10.110.10 - UBQ-conjugat_E2 (Gene3D link)

    Interpro entry IPR016135 : (Interpro link)

    Interpro description:

    This entry represents a structural domain with an alpha-beta(4)-alpha(3) core fold. Domains of this structure are found in:

    Proteins where this domain is known:
    MAL13P1.227    MAL8P1.41    PF08_0085    PF10_0330    PF13_0301    PF14_0128    PFC0255c    PFC0855w    PFE1350c    PFF0305c    PFI0740c    PFI1030c    PFL0190w    PFL2100w    PFL2175w   


    G3DSA:3.10.120.10 - Cyt_B5 (Gene3D link)

    Interpro entry IPR001199 : Cytochrome b5 (Interpro link)

    Interpro description:
    Cytochromes b5 are ubiquitous electron transport proteins found in animals, plants and yeasts. The microsomal and mitochondrial variants are membrane-bound, while those from erythrocytes and other animal tissues are water-soluble.

    The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.

    Proteins where this domain is known:
    PF14_0266    PFI0885w    PFL1555w   


    G3DSA:3.10.129.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF11_0364    PF13_0128   


    G3DSA:3.10.180.10 - G3DSA:3.10.180.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0145    PFF0230c   


    G3DSA:3.10.20.230 - G3DSA:3.10.20.230 (Gene3D link)

    Proteins where this domain is known:
    PFE0890c   


    G3DSA:3.10.20.30 - Ferredoxin_fold (Gene3D link)

    Interpro entry IPR012675 : (Interpro link)

    Interpro description:

    This domain has a beta-grasp fold with a core structure consisting of beta(2)-alpha-beta(2), which is similar to that found in ubiquitin. Domains with this type of structure are found in the 2Fe-2S ferredoxin family (including putidaredoxin and adrenodoxin), the 2Fe-2S ferredoxin-related family (including aldehyde reductase, and xanthine dehydrogenase), the TGS family (including threonyl-tRNA synthetase) and the MoaD/ThiS family (including molybdopterin, and thiamine biosynthesis sulphur carrier protein).

    Proteins where this domain is known:
    MAL13P1.95    MAL7P1.122    PFL0630w    PFL0705c   


    G3DSA:3.10.20.90 - G3DSA:3.10.20.90 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.64    MAL8P1.122    PF08_0067    PF10_0114    PF10_0193    PF11_0142    PF11_0329    PF11_0393    PF13_0084    PF13_0346    PF14_0027    PFE0285c    PFE1355c    PFI1085w    PFL0585w    PFL1830w   


    G3DSA:3.10.200.10 - Euk_COanhd (Gene3D link)

    Interpro entry IPR001148 : (Interpro link)

    Interpro description:

    Carbonic anhydrases (CAs) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.

    This entry represents alpha class carbonic anhydrases.

    More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.

    Proteins where this domain is known:
    PF11_0410    PF11_0411   


    G3DSA:3.10.250.10 - G3DSA:3.10.250.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0067   


    G3DSA:3.10.280.10 - MAM33 (Gene3D link)

    Interpro entry IPR003428 : Mitochondrial glycoprotein (Interpro link)

    Interpro description:
    This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular 'heads' of C1Q.

    Proteins where this domain is known:
    PF14_0329   


    G3DSA:3.10.290.10 - G3DSA:3.10.290.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0181    PF14_0584    PFE1005w   


    G3DSA:3.10.300.10 - PurDNA_glycsylse (Gene3D link)

    Interpro entry IPR003180 : Methylpurine-DNA glycosylase (MPG) (Interpro link)

    Interpro description:

    Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension.

    Proteins where this domain is known:
    PF14_0639   


    G3DSA:3.10.330.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF07_0047    PFC0140c    PFF0940c   


    G3DSA:3.10.330.30 - Ribosomal_S27E (Gene3D link)

    Interpro entry IPR000592 : Ribosomal protein S27e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian, yeast, Chlamydomonas reinhardtii and Entamoeba histolytica S27, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0250. These proteins have from 62 to 87 amino acids. They contain, in their central section, a putative zinc-finger region of the type C-x(2)-C-x(14)-C-x(2)-C.
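
    The zinc-finger pattern quoted above is written in a PROSITE-like notation, where "x(n)" stands for n residues of any type. As a minimal sketch (the function names and the toy sequence below are illustrative assumptions, not part of the Interpro entry), such a pattern could be translated into a regular expression and searched for in a protein sequence like this:

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern (e.g. 'C-x(2)-C') into a regex.

    Only single residues and 'x' wildcards with an optional '(n)' repeat
    count are handled; this is a deliberately small subset of the notation.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"([A-Za-z])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = m.groups()
        token = "." if residue.lower() == "x" else residue.upper()
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


# Pattern taken from the description of the S27e family above.
S27E_ZINC_FINGER = "C-x(2)-C-x(14)-C-x(2)-C"
S27E_REGEX = re.compile(prosite_to_regex(S27E_ZINC_FINGER))


def find_motif(sequence: str):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in S27E_REGEX.finditer(sequence)]


# Hypothetical toy sequence, built only to exercise the pattern.
toy_sequence = "MACAAC" + "G" * 14 + "CKKCLL"
hits = find_motif(toy_sequence)
```

    A match only shows that the four cysteines are spaced as the pattern requires; it does not by itself establish zinc binding.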

    Proteins where this domain is known:
    PF13_0045   


    G3DSA:3.10.370.10 - G3DSA:3.10.370.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.76   


    G3DSA:3.10.440.10 - Ribosomal_L31e (Gene3D link)

    Interpro entry IPR000054 : Ribosomal protein L31e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:

  • Yeast L34
  • Archaeal L31
  • Plants L31
  • Mammalian L31

    Proteins where this domain is known:
    PFE0185c   


    G3DSA:3.10.450.50 - G3DSA:3.10.450.50 (Gene3D link)

    Proteins where this domain is known:
    PF14_0122   


    G3DSA:3.10.450.80 - Ribosomal_L44E (Gene3D link)

    Interpro entry IPR000552 : Ribosomal protein L44e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA-based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian, Trypanosoma brucei, Caenorhabditis elegans and fungal L44, and Haloarcula marismortui LA.

    Proteins where this domain is known:
    PFC0200w   


    G3DSA:3.10.50.40 - G3DSA:3.10.50.40 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.68    PFL2275c   


    G3DSA:3.100.10.10 - G3DSA:3.100.10.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.209    PFF0885w   


    G3DSA:3.20.10.10 - G3DSA:3.20.10.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0557   


    G3DSA:3.20.100.10 - mRNA_capping_enz_bsu (Gene3D link)

    Interpro entry IPR004206 : mRNA capping enzyme, beta subunit (Interpro link)

    Interpro description:
    The mRNA capping enzyme in yeasts is composed of two separate subunits, an mRNA guanylyltransferase and an RNA 5'-triphosphatase. This is the beta subunit of mRNA capping enzyme, which has triphosphatase activity. The beta chain (polynucleotide 5'-phosphatase) converts the 5'-triphosphate end of a nascent mRNA chain into a diphosphate in the first step of mRNA capping. The function of the capping enzyme also depends on the guanylyltransferase activity conferred by the alpha chain.

    Proteins where this domain is known:
    PFC0980c   


    G3DSA:3.20.130.10 - TtdB_fumA_fumB (Gene3D link)

    Interpro entry IPR004647 : Fe-S type hydro-lyases tartrate/fumarate beta region (Interpro link)

    Interpro description:

    A number of Fe-S cluster-containing hydro-lyases share a conserved motif, including argininosuccinate lyase, adenylosuccinate lyase, aspartase, class I fumarate hydratase (fumarase), and tartrate dehydratase. Proteins in this group represent a subset of closely related proteins or modules, including the Escherichia coli tartrate dehydratase beta chain and the C-terminal region of the class I fumarase (where the N-terminal region is homologous to the tartrate dehydratase alpha chain). The activity of the archaeal proteins in this group is unknown.

    Proteins where this domain is known:
    PFI1340w   


    G3DSA:3.20.140.10 - G3DSA:3.20.140.10 (Gene3D link)

    Proteins where this domain is known:
    PFF1410c   


    G3DSA:3.20.19.10 - Aconitase/3IPM_dehydase_swvl (Gene3D link)

    Interpro entry IPR015928 : Aconitase/3-isopropylmalate dehydratase, swivel (Interpro link)

    Interpro description:

    3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydrogenases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), in several archaea and in one thermophilic species of bacteria, Thermus thermophilus.

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.

    This entry represents the 'swivel' domain found at the C-terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the N-terminal region following the HEAT-like domain in bacterial AcnB. This domain has a three-layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    G3DSA:3.20.20.10 - G3DSA:3.20.20.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0322    PFI0965w   


    G3DSA:3.20.20.100 - Aldo/ket_red (Gene3D link)

    Interpro entry IPR001395 : Aldo/keto reductase (Interpro link)

    Interpro description:

    The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.

    Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.

    Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.

    Proteins where this domain is known:
    MAL13P1.324    PF14_0088   


    G3DSA:3.20.20.105 - tRNA_ribo_trans (Gene3D link)

    Interpro entry IPR002616 : Queuine/other tRNA-ribosyltransferase (Interpro link)

    Interpro description:
    This is a family of queuine, archaeosine and general tRNA-ribosyltransferases, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine at position 34 and with archaeosine at position 15 in archaeal tRNAs. In bacteria it catalyses the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA, giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues.
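
    The zinc binding motif C-x-C-x2-C-x29-H fixes the spacing between the four putative metal ligands. As a rough sketch (the regular expression is a direct transcription of that motif, while the function name and the toy sequence are illustrative assumptions), the positions of the three cysteines and the histidine could be reported for each match as follows:

```python
import re

# C-x-C-x2-C-x29-H, with each putative zinc ligand captured in its own group.
ZINC_MOTIF = re.compile(r"(C).(C).{2}(C).{29}(H)")


def zinc_ligand_positions(sequence: str):
    """Return 1-based positions of the C, C, C and H residues for each match."""
    hits = []
    for match in ZINC_MOTIF.finditer(sequence):
        hits.append(tuple(match.start(group) + 1 for group in range(1, 5)))
    return hits


# Hypothetical toy sequence, built only to exercise the motif.
toy_sequence = "AA" + "CGCGGC" + "A" * 29 + "H" + "AA"
positions = zinc_ligand_positions(toy_sequence)
```

    Note that finditer reports non-overlapping matches only, so overlapping candidate sites in a real sequence would need a lookahead-based search.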

    Proteins where this domain is known:
    PF07_0071    PF14_0322    PFL2030w   


    G3DSA:3.20.20.120 - G3DSA:3.20.20.120 (Gene3D link)

    Proteins where this domain is known:
    PF10_0155   


    G3DSA:3.20.20.140 - G3DSA:3.20.20.140 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.146    PF10_0289    PF14_0697    PFA0580c   


    G3DSA:3.20.20.150 - Xyl_isomerase-like_TIM-brl (Gene3D link)

    Interpro entry IPR013022 : (Interpro link)

    Interpro description:

    This entry represents a structural motif with a beta/alpha TIM barrel found in several protein families:

    These proteins share similar, but not identical, metal-binding sites. In addition, xylose isomerase and L-rhamnose isomerase each have additional alpha-helical domains involved in tetramer formation. This entry differs from IPR012307 in having a wider coverage of TIM-barrel protein families.

    Proteins where this domain is known:
    PF13_0176   


    G3DSA:3.20.20.190 - PLC-like_Pdiesterase_TIM-brl (Gene3D link)

    Interpro entry IPR017946 : PLC-like phosphodiesterase, TIM beta/alpha-barrel domain (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of a TIM beta/alpha-barrel. These domains are found in several phospholipase C (PLC) like phosphodiesterases, including:

    Phospholipase C (PLC) isozymes are directly activated by heterotrimeric G proteins and Ras-like GTPases to hydrolyze phosphatidylinositol 4,5-bisphosphate into the second messengers diacylglycerol and inositol 1,4,5-trisphosphate. PLC enzymes often play central roles in various signalling cascades.

    Proteins where this domain is known:
    PF10_0132    PF14_0060   


    G3DSA:3.20.20.20 - Dhdropt_synth (Gene3D link)

    Interpro entry IPR000489 : Dihydropteroate synthase, DHPS (Interpro link)

    Interpro description:

    All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesize folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. Proteins containing this domain include dihydropteroate synthase as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulphur protein methyltransferase (MeTr) that catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation.

    Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate to para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).

    Proteins where this domain is known:
    PF08_0095   


    G3DSA:3.20.20.210 - no description (Gene3D link)

    Proteins where this domain is known:
    PFF0360w   


    G3DSA:3.20.20.330 - S_methyl_trans (Gene3D link)

    Interpro entry IPR003726 : Homocysteine S-methyltransferase (Interpro link)

    Interpro description:
    S-methylmethionine:homocysteine methyltransferase from Escherichia coli accepts selenohomocysteine as a substrate. S-methylmethionine is an abundant plant product that can be utilised for methionine biosynthesis. Human methionine synthase (5-methyltetrahydrofolate:L-homocysteine S-transmethylase) shares 53 and 63% identity with the E. coli and the presumptive Caenorhabditis elegans proteins, respectively, and contains all residues implicated in B12 binding to the E. coli protein. Betaine--homocysteine S-methyltransferase converts betaine and homocysteine to dimethylglycine and methionine, respectively. This reaction is also required for the irreversible oxidation of choline.

    Proteins where this domain is known:
    PFL1625w   


    G3DSA:3.20.20.60 - Pyrv/PenolPyrv_Kinase_cat (Gene3D link)

    Interpro entry IPR015813 : Pyruvate/Phosphoenolpyruvate kinase, catalytic core (Interpro link)

    Interpro description:

    Pyruvate kinase controls the exit from the glycolysis pathway, catalysing the transfer of phosphate from phosphoenolpyruvate (PEP) to ADP. Mammalian pyruvate kinase is a homotetramer, where each polypeptide subunit consists of four domains: N-terminal, A domain, B domain and C-terminal. Activation of the enzyme is believed to occur via the clamping down of the B domain onto the A domain to dehydrate the active site cleft. The N- and C-terminal domains are situated at inter-subunit contact sites, and could be involved in assembly and communication within the complex. The N-terminal domain has a TIM beta/alpha-barrel structure. Homologous TIM-barrel domains are found in the following proteins:

    Proteins where this domain is known:
    PF10_0363    PF14_0246    PFF1300w   


    G3DSA:3.20.20.70 - Aldolase_TIM (Gene3D link)

    Interpro entry IPR013785 : Aldolase-type TIM barrel (Interpro link)

    Interpro description:

    This entry represents the TIM beta/alpha barrel found in aldolase and in related proteins. This TIM barrel usually covers the entire protein structure. Proteins containing this TIM barrel domain include class I aldolases, class I DAHP synthases, class II fructose-bisphosphate aldolases (FBP aldolases), and 5-aminolaevulinate dehydratase (a hybrid of classes I and II aldolases).

    Proteins where this domain is known:
    MAL13P1.220    MAL13P1.319    PF10_0210    PF10_0225    PF14_0086    PF14_0334    PF14_0378    PF14_0381    PF14_0425    PFC0831w    PFF0160c    PFF0680c    PFI0920c    PFI1020c    PFL0960w   


    G3DSA:3.20.20.80 - Glyco_hydro_cat (Gene3D link)

    Interpro entry IPR013781 : Glycoside hydrolase, subgroup, catalytic core (Interpro link)

    Interpro description:

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.

    This entry represents the catalytic TIM beta/alpha barrel common to many different families of glycosyl hydrolases. Structures have been determined for several proteins containing this domain, including family 13 glycosyl hydrolases (such as alpha-amylase), beta-glycanases, family 1 glycosyl hydrolases (such as beta-glucosidase), type II chitinases, 1,4-beta-N-acetylmuraminidases, and beta-N-acetylhexosaminidases.

    More information about this protein can be found at Protein of the Month: alpha-Amylase.

    Proteins where this domain is known:
    MAL13P1.258    PFL2510w   


    G3DSA:3.20.90.10 - Tubby_C (Gene3D link)

    Interpro entry IPR000007 : (Interpro link)

    Interpro description:

    Tubby, an autosomal recessive mutation, mapping to mouse chromosome 7, was recently found to be the result of a splicing defect in a novel gene with unknown function. This mutation maps to the tub gene. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. Although these phenotypes indicate a vital role for tubby proteins, no biochemical function has yet been ascribed to any family member, although it has been suggested that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development.

    Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed beta-barrel that surrounds a central alpha helix (at the extreme carboxyl terminus of the protein), which forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors.

    Proteins where this domain is known:
    PF14_0058   


    G3DSA:3.30.1010.10 - G3DSA:3.30.1010.10 (Gene3D link)

    Proteins where this domain is known:
    PFD0965W    PFE0485w   


    G3DSA:3.30.110.10 - IF3 (Gene3D link)

    Interpro entry IPR001288 : Initiation factor 3 (Interpro link)

    Interpro description:

    Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic protein that binds to the 30S ribosomal subunit. The chloroplast initiation factor IF-3(chl) is a protein that enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal 30S subunits; its central section is evolutionarily related to the sequence of bacterial IF-3.

    Proteins where this domain is known:
    MAL8P1.27   


    G3DSA:3.30.110.20 - G3DSA:3.30.110.20 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.233    PF10_0063   


    G3DSA:3.30.110.30 - Pro-tRNA-synth_II_C_arc/euk (Gene3D link)

    Interpro entry IPR016061 : Prolyl-tRNA synthetase, class II, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Prolyl-tRNA synthetase exists in two forms, which are loosely related. The first form is present in the majority of eubacterial species. The second form, although present in some eubacteria, is found mainly in archaea and eukaryota. Prolyl-tRNA synthetase belongs to class IIa.

    This domain is found at the C-terminus in archaeal and eukaryotic enzymes, as well as in certain bacterial ones.

    Proteins where this domain is known:
    PFL0670c   


    G3DSA:3.30.1130.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PFL1155w   


    G3DSA:3.30.1140.32 - Ribosomal_S3_C (Gene3D link)

    Interpro entry IPR001351 : Ribosomal protein S3, C-terminal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast. This entry is the C-terminal domain.

    Proteins where this domain is known:
    PF14_0627   


    G3DSA:3.30.1200.10 - DUF167 (Gene3D link)

    Interpro entry IPR003746 : (Interpro link)

    Interpro description:

    This entry describes proteins of unknown function. Structures for two of these proteins, YggU from Escherichia coli and MTH637 from the archaeon Methanobacterium thermoautotrophicum, have been determined; they have a core 2-layer alpha/beta structure consisting of beta(2)-loop-alpha-beta(2)-alpha.

    Proteins where this domain is known:
    PF14_0542   


    G3DSA:3.30.1320.10 - Ribosomal_S16 (Gene3D link)

    Interpro entry IPR000307 : Ribosomal protein S16 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:

    S16 proteins have about 100 amino-acid residues.

    Proteins where this domain is known:
    PFE1560c   


    G3DSA:3.30.1330.10 - G3DSA:3.30.1330.10 (Gene3D link)

    Proteins where this domain is known:
    PFI0505c   


    G3DSA:3.30.1330.20 - Tubulin/FtsZ_2-layer-sand-dom (Gene3D link)

    Interpro entry IPR018316 : Tubulin/FtsZ, 2-layer sandwich domain (Interpro link)

    Interpro description:

    This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.

    Proteins where this domain is known:
    PF08_0125    PF10_0084    PF14_0725    PFD1050w    PFI0180w   


    G3DSA:3.30.1330.30 - G3DSA:3.30.1330.30 (Gene3D link)

    Proteins where this domain is known:
    MAL7P1.118    PF10_0187    PF11_0250    PF14_0231    PFB0550w    PFB0855c    PFC0295c    PFC0405c    PFD0960c   


    G3DSA:3.30.1330.50 - MECDP_synthase_core (Gene3D link)

    Interpro entry IPR003526 : 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, core (Interpro link)

    Interpro description:

    This entry represents MECDP (2-C-methyl-D-erythritol 2,4-cyclodiphosphate) synthetase, an enzyme in the non-mevalonate pathway of isoprenoid synthesis, isoprenoids being essential in all organisms. Isoprenoids can also be synthesized through the mevalonate pathway. The non-mevalonate route is used by many bacteria and human pathogens, including Mycobacterium tuberculosis and Plasmodium falciparum. This route appears to involve seven enzymes. MECDP synthetase catalyses the intramolecular attack by a phosphate group on a diphosphate, with cytidine monophosphate (CMP) acting as the leaving group to give the cyclic diphosphate product MECDP. The enzyme is a trimer with three active sites shared between adjacent copies of the protein. The enzyme also has two metal binding sites, the metals playing key roles in catalysis.

    A number of proteins from eukaryotes and prokaryotes share this common N-terminal signature and appear to be involved in terpenoid biosynthesis. The ygbB protein is a putative enzyme of this type.

    Proteins where this domain is known:
    PFB0420w   


    G3DSA:3.30.1360.10 - G3DSA:3.30.1360.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0023    PF14_0150   


    G3DSA:3.30.1360.120 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.75a   


    G3DSA:3.30.1360.20 - Trans_pterinDh (Gene3D link)

    Interpro entry IPR001533 : Transcriptional coactivator/pterin dehydratase (Interpro link)

    Interpro description:

    DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.

    Proteins where this domain is known:
    PF11_0095a   


    G3DSA:3.30.1360.90 - RNA_pol_Rpb1_7 (Gene3D link)

    Interpro entry IPR007073 : RNA polymerase Rpb1, domain 7 (Interpro link)

    Interpro description:

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 7, represents a mobile module of the RNA polymerase. Domain 7 interacts with the lobe domain of Rpb2.

    Proteins where this domain is known:
    PFC0805w   


    G3DSA:3.30.1370.10 - G3DSA:3.30.1370.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0115    PF14_0151    PF14_0661    PFB0370c    PFE0500c    PFF0250w    PFF1135w   


    G3DSA:3.30.1370.30 - G3DSA:3.30.1370.30 (Gene3D link)

    Proteins where this domain is known:
    PFC0735w   


    G3DSA:3.30.1370.40 - no description (Gene3D link)

    Proteins where this domain is known:
    PF11_0416    PF13_0233   


    G3DSA:3.30.1380.20 - no description (Gene3D link)

    Proteins where this domain is known:
    PF14_0358    PFD0895c   


    G3DSA:3.30.1390.10 - Ribosomal_L7/12_C/ClpS-like (Gene3D link)

    Interpro entry IPR014719 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents a domain found at the C-terminus of ribosomal proteins L7 and L12, and also in the adaptor protein ClpS, forming an alpha/beta sandwich.

    The L7 and L12 ribosomal proteins are part of the large 50S ribosomal subunit, and occur in four copies organised as two dimers. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification: the addition of an acetyl group to the N terminus of L7.

    ClpS is an adaptor protein that influences protein degradation through its binding to the N-terminal domain of the chaperone ClpA in the ClpAP chaperone-protease pair. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins.

    Proteins where this domain is known:
    MAL13P1.111    PFB0545c    PFE1225w   


    G3DSA:3.30.1390.20 - G3DSA:3.30.1390.20 (Gene3D link)

    Proteins where this domain is known:
    PFC0300c   


    G3DSA:3.30.1430.10 - G3DSA:3.30.1430.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0272   


    G3DSA:3.30.1440.10 - Ribosomal_L5 (Gene3D link)

    Interpro entry IPR002132 : Ribosomal protein L5 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:

    L5 is a protein of about 180 amino-acid residues.

    Proteins where this domain is known:
    PF07_0079   


    G3DSA:3.30.1490.10 - G3DSA:3.30.1490.10 (Gene3D link)

    Proteins where this domain is known:
    MAL7P1.93    PFC0735w   


    G3DSA:3.30.1490.120 - G3DSA:3.30.1490.120 (Gene3D link)

    Proteins where this domain is known:
    PF10_0269    PF11_0058   


    G3DSA:3.30.1490.90 - G3DSA:3.30.1490.90 (Gene3D link)

    Proteins where this domain is known:
    PF13_0150   


    G3DSA:3.30.1520.10 - PX (Gene3D link)

    Interpro entry IPR001683 : Phox-like (Interpro link)

    Interpro description:

    The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47phox binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.

    The PX domain is conserved from yeast to human. Although multiple alignments of representative PX domain sequences show relatively little sequence conservation, their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain.

    Proteins where this domain is known:
    MAL7P1.108    PF07_0017   


    G3DSA:3.30.1550.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PFE0850c   


    G3DSA:3.30.1560.10 - Mago_nashi (Gene3D link)

    Interpro entry IPR004023 : Mago nashi protein (Interpro link)

    Interpro description:

    This family was originally identified in Drosophila and called mago nashi; it is a strict maternal effect, grandchildless-like gene. The human homologue has been shown to interact with an RNA binding protein, ribonucleoprotein rbm8. An RNAi knockout of the Caenorhabditis elegans homologue causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination, but the protein is also found in hermaphrodites and other organisms without a sexual differentiation.

    Proteins where this domain is known:
    MAL7P1.139   


    G3DSA:3.30.160.20 - dsRNA-bd-like (Gene3D link)

    Interpro entry IPR014720 : Double-stranded RNA-binding-like (Interpro link)

    Interpro description:

    The double-stranded RNA-binding domain (dsRBD), which is found in a variety of proteins, shares a common structure with the N-terminal domain of ribosomal protein S5, namely an alpha-beta(3)-alpha structure that folds into two layers, alpha/beta. The dsRBD is found in a variety of functionally distinct proteins, including Drosophila staufen proteins (five copies of motif), dsRNA-dependent protein kinase pkr, and RNase III. Ribosomal protein S5 functions in the small ribosomal subunit, and in Escherichia coli has been shown to be important in the assembly and function of the 30S subunit.

    Proteins where this domain is known:
    PF14_0448   


    G3DSA:3.30.160.40 - Porphobil_deam (Gene3D link)

    Interpro entry IPR000860 : Tetrapyrrole biosynthesis, hydroxymethylbilane synthase (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.

    This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease Acute Intermittent Porphyria (AIP).

    Proteins where this domain is known:
    PFL0480w   


    G3DSA:3.30.170.10 - Cyclin-dep_kinase_reg-sub (Gene3D link)

    Interpro entry IPR000789 : Cyclin-dependent kinase, regulatory subunit (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual 12-stranded beta-barrel structure. Through the barrel centre runs a 12 Å diameter tunnel, lined by 6 exposed helix pairs. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation.

    Proteins where this domain is known:
    PFI1155w   


    G3DSA:3.30.1740.10 - Znf_PARP (Gene3D link)

    Interpro entry IPR001510 : Zinc finger, PARP-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains.

    NAD(+) ADP-ribosyltransferase is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFL2440w   


    G3DSA:3.30.190.20 - Ribosomal_L1_2-a/b-sand (Gene3D link)

    Interpro entry IPR016094 : Ribosomal protein L1, 2-layer alpha/beta-sandwich (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain, which interrupts the first domain. This entry represents the 2-layer sandwich domain.

    In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:

    Proteins where this domain is known:
    PF14_0391    PFL0500w   


    G3DSA:3.30.200.20 - G3DSA:3.30.200.20 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.185    MAL13P1.84    MAL7P1.18    PF07_0072    PF08_0044    PF11_0079    PF11_0096    PF11_0147    PF11_0220    PF11_0377    PF11_0464    PF14_0068    PF14_0320    PF14_0423    PF14_0476    PF14_0715    PFC0525c    PFC0755c    PFD0740w    PFF0750w    PFF1370w    PFI1280c   


    G3DSA:3.30.230.10 - Ribosomal_S5_D2-type_fold (Gene3D link)

    Interpro entry IPR014721 : (Interpro link)

    Interpro description:

    Domain 2 of the ribosomal protein S5 has a left-handed beta-alpha-beta fold that is found in numerous RNA/DNA-binding proteins, as well as in kinases from the GHMP kinase family. Proteins containing this beta-alpha-beta fold domain include:

    Proteins where this domain is known:
    MAL7P1.145    MAL7P1.66    PF08_0076    PF10_0041    PF11_0184    PF11_0382    PF14_0132    PF14_0316    PF14_0448    PF14_0486    PFE0150c    PFF0115c    PFL1590c    PFL1915w   


    G3DSA:3.30.30.50 - G3DSA:3.30.30.50 (Gene3D link)

    Proteins where this domain is known:
    PF10_0103   


    G3DSA:3.30.300.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF10_0123    PFI1090w   


    G3DSA:3.30.300.20 - KH_prok (Gene3D link)

    Interpro entry IPR015946 : (Interpro link)

    Interpro description:

    This entry represents prokaryotic-type K homology domains, as well as related domains that share the same 2-layer alpha/beta structure.

    The K homology domain is a common RNA-binding motif present in one or multiple copies in both prokaryotic and eukaryotic regulatory proteins. The KH motifs may act cooperatively to bind RNA in the case of multiple motifs, or independently in the case of single KH motif proteins. Prokaryotic (pKH) and eukaryotic (eKH) KH domains share a KH-motif, but have different topologies. The pKH domain has been found in a number of proteins, including the N-terminal domain of the S3 ribosomal protein, the C-terminal domain of Era GTPase and the two C-terminal domains of the NusA transcription factor. The structure of the pKH domain consists of a two-layer alpha/beta fold in the arrangement alpha/beta(2)/alpha/beta.

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    PF10_0035    PF14_0339    PFC0565w    PFL0835w    PFL1985c   


    G3DSA:3.30.300.30 - G3DSA:3.30.300.30 (Gene3D link)

    Proteins where this domain is known:
    PFF0945c    PFF1350c   


    G3DSA:3.30.300.90 - BolA (Gene3D link)

    Interpro entry IPR002634 : (Interpro link)

    Interpro description:

    This family consists of the morpho-protein BolA from Escherichia coli and its various homologs. In E. coli, over-expression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5.

    Proteins where this domain is known:
    PFE0790c   


    G3DSA:3.30.310.10 - b_Adaptin_TBP_C (Gene3D link)

    Interpro entry IPR012295 : Beta2-adaptin/TATA-box binding, C-terminal (Interpro link)

    Interpro description:

    The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins. The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box, and interacts with transcription factors and regulatory proteins.

    The beta(2)-adaptor is one of four subunits that comprise the clathrin adaptor, which plays a central role in clathrin-mediated endocytosis by linking transmembrane receptors to be internalised to the clathrin lattice. The C-terminal domain of beta(2)-adaptor is the appendage or ear domain, which is involved in clathrin polymerisation.

    Even though the C-terminal of beta(2)-adaptin has a very low sequence identity with the C-terminal of the TATA-box binding protein, they do share structural similarities, namely a beta-alpha-beta(4)-alpha core structure.

    Proteins where this domain is known:
    PF14_0267    PFE0305w    PFE1400c   


    G3DSA:3.30.310.30 - AP2_A_adaptin_C (Gene3D link)

    Interpro entry IPR015873 : Clathrin alpha-adaptin/coatomer adaptor, appendage, C-terminal subdomain (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.

    Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.

    While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits.

    This entry represents a subdomain of the appendage (ear) domain of alpha-adaptin from AP clathrin adaptor complexes, and the appendage domain of the gamma subunit of coatomer complexes. These domains have a three-layer arrangement, alpha-beta-alpha, with a bifurcated antiparallel beta-sheet. Although the appendage domains from AP adaptins and coatomers share a similar fold, there is little sequence identity between them. However, they also share similar motif-based cargo recognition and accessory factor recruitment mechanisms.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0463   


    G3DSA:3.30.360.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF14_0511    PF14_0598   


    G3DSA:3.30.390.10 - G3DSA:3.30.390.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0155   


    G3DSA:3.30.390.30 - Pyr_redox_dim (Gene3D link)

    Interpro entry IPR004099 : Pyridine nucleotide-disulphide oxidoreductase, dimerisation (Interpro link)

    Interpro description:

    This entry represents a dimerisation domain that is usually found at the C-terminal of both class I and class II oxidoreductases, as well as in NADH oxidases and peroxidases.

    Proteins where this domain is known:
    PF08_0066    PF14_0192    PFI1170c    PFL1550w   


    G3DSA:3.30.390.50 - CO_DH_flav_C (Gene3D link)

    Interpro entry IPR005107 : (Interpro link)

    Interpro description:

    Proteins containing this domain form structural complexes with other known families. The carbon monoxide (CO) dehydrogenase of Oligotropha carboxidovorans is a heterotrimeric complex composed of an apoflavoprotein, a molybdoprotein, and an iron-sulphur protein. It can be dissociated with sodium dodecylsulphate. CO dehydrogenase catalyzes the oxidation of CO according to the following equation:

    CO + H2O = CO2 + 2e- + 2H+

    Subunit S represents the iron-sulphur protein of CO dehydrogenase and is clearly divided into a C- and an N-terminal domain, each binding a [2Fe-2S] cluster.
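
    The CO oxidation equation above can be checked for atom and charge balance with a short script. This is only an illustrative sketch: the species table, the helper names (totals, is_balanced) and the representation of each side as (coefficient, species) pairs are assumptions made for this example and are not part of the Interpro entry.

```python
from collections import Counter

# Element counts and net charge for each species in the equation.
# An electron contributes charge but no atoms.
SPECIES = {
    "CO":  (Counter({"C": 1, "O": 1}), 0),
    "H2O": (Counter({"H": 2, "O": 1}), 0),
    "CO2": (Counter({"C": 1, "O": 2}), 0),
    "e-":  (Counter(), -1),
    "H+":  (Counter({"H": 1}), +1),
}

# CO + H2O = CO2 + 2e- + 2H+, written as (coefficient, species) pairs.
LEFT = [(1, "CO"), (1, "H2O")]
RIGHT = [(1, "CO2"), (2, "e-"), (2, "H+")]


def totals(side):
    """Sum the atoms and the net charge over one side of the equation."""
    atoms, charge = Counter(), 0
    for coeff, name in side:
        elements, q = SPECIES[name]
        for element, n in elements.items():
            atoms[element] += coeff * n
        charge += coeff * q
    return atoms, charge


def is_balanced(left, right):
    """True when both sides carry the same atoms and the same net charge."""
    return totals(left) == totals(right)


if __name__ == "__main__":
    print(totals(LEFT), totals(RIGHT), is_balanced(LEFT, RIGHT))
```

    Both sides carry one carbon, two oxygens and two hydrogens, and the two electrons offset the two protons, so the net charge is zero on each side.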

    Proteins where this domain is known:
    PF13_0083   


    G3DSA:3.30.40.10 - Znf_RING/FYVE/PHD (Gene3D link)

    Interpro entry IPR013083 : (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents RING-, PHD-, and FYVE-type zinc finger domains, which share a common dimetal (zinc)-bound alpha/beta structural fold, as well as the non-zinc-containing U-box domain, which is similar to the RING zinc finger but lacks the metal ion-binding residues (the U-box is associated with multi-ubiquitination).

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.216    MAL13P1.405    MAL7P1.155    PF07_0026    PF08_0020    PF10_0046    PF10_0072    PF10_0079    PF10_0117    PF10_0276    PF11_0244    PF11_0429    PF13_0188    PF14_0054    PF14_0139    PF14_0215    PF14_0574    PFB0440c    PFC0365w    PFC0425w    PFC0510w    PFC0610c    PFC0740c    PFC0845c    PFD0765w    PFE0900w    PFE1490c    PFF0165c    PFF0355c    PFF1180w    PFF1185w    PFF1325c    PFI0470w    PFL0275w    PFL0440c    PFL1010c    PFL1705w   


    G3DSA:3.30.420.10 - G3DSA:3.30.420.10 (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.104    PF10_0165    PF13_0208    PF14_0112    PFD0590c    PFF1150w    PFF1470c   


    G3DSA:3.30.420.100 - G3DSA:3.30.420.100 (Gene3D link)

    Proteins where this domain is known:
    PF14_0230    PFF0650w   


    G3DSA:3.30.420.40 - G3DSA:3.30.420.40 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.540    MAL7P1.153    MAL7P1.228    PF07_0033    PF07_0077    PF08_0054    PF11_0047    PF11_0114    PF11_0351    PF14_0124    PF14_0218    PFA0190c    PFD0487c    PFE0255w    PFI0520w    PFI0875w    PFL2215w   


    G3DSA:3.30.420.60 - G3DSA:3.30.420.60 (Gene3D link)

    Proteins where this domain is known:
    PFB0550w   


    G3DSA:3.30.420.80 - Ribosomal_S11 (Gene3D link)

    Interpro entry IPR001971 : Ribosomal protein S11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.

    Proteins where this domain is known:
    PF14_0519    PFE0810c   


    G3DSA:3.30.428.10 - His_triad_motif (Gene3D link)

    Interpro entry IPR011151 : Histidine triad motif (Interpro link)

    Interpro description:

    The histidine triad motif (HIT) is related to the sequence H-phi-H-phi-H-phi-phi (where phi is a hydrophobic amino acid). Proteins containing HIT domains form a superfamily of nucleotide hydrolases and transferases that act on the alpha-phosphate of ribonucleotides. This entry covers two HIT-containing protein families:

    Proteins where this domain is known:
    PF08_0059    PF14_0349   
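
    The H-phi-H-phi-H-phi-phi pattern described above lends itself to a simple sequence scan. The sketch below is a minimal illustration, not the method Interpro or Gene3D use to assign this domain (those rely on structure-based models). The set of residues treated as hydrophobic (HYDROPHOBIC), the function name find_hit_motifs and the toy sequence in the usage lines are all assumptions made for the example.

```python
import re

# Residues counted as "phi" (hydrophobic). This set is an assumption;
# different definitions of hydrophobicity would change the matches.
HYDROPHOBIC = "AVLIMFWYC"

# H-phi-H-phi-H-phi-phi, wrapped in a lookahead so that overlapping
# occurrences are also reported.
HIT_PATTERN = re.compile(
    rf"(?=(H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}))"
)


def find_hit_motifs(sequence):
    """Return (0-based start, matched residues) for each HIT-like match."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in HIT_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence invented for illustration; not a real P. falciparum protein.
    print(find_hit_motifs("MKGHVHVHLLAAK"))
```

    A match from such a scan only flags a candidate; whether a protein actually adopts the HIT fold has to be confirmed against the domain models themselves.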


    G3DSA:3.30.429.10 - Tautomerase (Gene3D link)

    Interpro entry IPR014347 : (Interpro link)

    Interpro description:

    Tautomerase superfamily members have a (beta-alpha-beta)2 structure in two layers, and use a similar mechanism of action involving an amino-terminal proline as a general base in a keto-enol tautomerisation reaction. Members of this superfamily include macrophage migration inhibitory factor (MIF) and related proteins such as D-dopachrome tautomerase; 4-oxalocrotonate tautomerase and related enzymes such as trans-3-chloroacrylic acid dehalogenase; and 5-carboxymethyl-2-hydroxymuconate isomerase (CHMI).

    Macrophage migration inhibitory factor (MIF) is a key regulatory cytokine within innate and adaptive immune responses, capable of promoting and modulating the magnitude of the response. MIF is released from T-cells and macrophages, and it can regulate cytokine secretion and the expression of receptors involved in the immune response. MIF has been linked to various inflammatory diseases, such as rheumatoid arthritis and atherosclerosis.

    4-Oxalocrotonate tautomerase (4-OT) is a plasmid-encoded enzyme that catalyzes the isomerisation of beta,gamma-unsaturated enones to their alpha,beta-isomers. This enzyme is part of the plasmid-encoded catechol meta-fission pathway, which enables the bacteria to use various aromatic hydrocarbons as their sole sources of carbon and energy.

    5-carboxymethyl-2-hydroxymuconate isomerase (CHMI) is a trimeric enzyme involved in the homoprotocatechuate pathway in Escherichia coli. This enzyme catalyses the isomerisation of 5-carboxymethyl-2-hydroxymuconate (CHM) to 5-carboxymethyl-2-oxo-3-hexene-1,6-dioate (COHED).

    Proteins where this domain is known:
    PFL1420w   


    G3DSA:3.30.450.120 - G3DSA:3.30.450.120 (Gene3D link)

    Proteins where this domain is known:
    PF10_0195   


    G3DSA:3.30.450.40 - G3DSA:3.30.450.40 (Gene3D link)

    Proteins where this domain is known:
    PFB0510w   


    G3DSA:3.30.450.50 - Longin (Gene3D link)

    Interpro entry IPR010908 : Longin (Interpro link)

    Interpro description:

    VAMPs (and their homologues, the synaptobrevins) define a group of SNARE proteins that contain a C-terminal coiled-coil/SNARE domain, in combination with variable N-terminal domains that are used to classify VAMPs: those containing longin N-terminal domains (~150 aa) are referred to as longins, while those with shorter N-termini are referred to as brevins. Longins are the only type of VAMP protein found in all eukaryotes, suggesting that their longin domain is essential. The longin domain is thought to exert a regulatory function. Longin domains have been shown to share the same structural fold, a profilin-like globular domain consisting of a five-stranded antiparallel beta-sheet that is sandwiched by an alpha-helix on one side, and two alpha-helices on the other (beta(2)-alpha-beta(3)-alpha(2)).

    Proteins where this domain is known:
    MAL13P1.135    PFC0890w    PFI0515w   


    G3DSA:3.30.450.60 - G3DSA:3.30.450.60 (Gene3D link)

    Proteins where this domain is known:
    PF11_0187    PF11_0202    PF11_0359    PF13_0062    PF14_0386    PFB0805c    PFD0745c    PFD1090c    PFL0885w    PFL2425w   


    G3DSA:3.30.450.70 - G3DSA:3.30.450.70 (Gene3D link)

    Proteins where this domain is known:
    PF13_0174    PFC0445w   


    G3DSA:3.30.460.10 - G3DSA:3.30.460.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0212   


    G3DSA:3.30.470.20 - ATP_grasp_subdomain_2 (Gene3D link)

    Interpro entry IPR013816 : ATP-grasp fold, subdomain 2 (Interpro link)

    Interpro description:

    The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyze the formation of amide bonds, catalyzing the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, and glutathione synthetase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP-hydrolysis to activate their substrates.

    The ATP-grasp fold shares functional and structural similarities with the PIPK (phosphatidylinositol phosphate kinase) and protein kinase superfamilies. The ATP-grasp domain consists of two subdomains with different alpha+beta folds, which grasp the ATP molecule between them. Each subdomain provides a variable loop that forms part of the active site, with regions from other domains also contributing to the active site, even though these other domains are not conserved between the various ATP-grasp enzymes. This entry represents subdomain 2 found at the C-terminal end of the ATP-grasp domain (the N-terminal subdomain is represented by a separate entry).

    Proteins where this domain is known:
    PF13_0044    PF14_0295    PF14_0664    PFE0605c   


    G3DSA:3.30.470.30 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.22   


    G3DSA:3.30.479.10 - G3DSA:3.30.479.10 (Gene3D link)

    Proteins where this domain is known:
    PFF1360w   


    G3DSA:3.30.479.20 - Transl_elong_EFTs/EF1B_dimer (Gene3D link)

    Interpro entry IPR014039 : Translation elongation factor EFTs/EF1B, dimerisation (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    This entry represents the C-terminal dimerisation domain found primarily in EF-Ts (EF1B) proteins from bacteria, mitochondria and chloroplasts.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PFC0225c   


    G3DSA:3.30.499.10 - Acnase/IPM_dHydase_lsu_aba_1/3 (Gene3D link)

    Interpro entry IPR015931 : Aconitase/3-isopropylmalate dehydratase large subunit, alpha/beta/alpha, subdomains 1 and 3 (Interpro link)

    Interpro description:

    3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3).

    This entry represents a domain with an alpha/beta/alpha topology. This structural domain usually occurs in triplicate, with domains 1 and 3 being the most closely related since they share the same pseudo 2-fold symmetry. This entry represents domains 1 and 3. This triple domain region is found at the N-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the C-terminal of bacterial AcnB; in each case, this region binds the [4Fe-4S]-cluster. This triple domain region is also found in the large subunit of isopropylmalate dehydratase (LeuC).

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    G3DSA:3.30.530.20 - G3DSA:3.30.530.20 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.256   


    G3DSA:3.30.530.30 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.300   


    G3DSA:3.30.530.40 - no description (Gene3D link)

    Proteins where this domain is known:
    PFC0360w   


    G3DSA:3.30.538.10 - G3DSA:3.30.538.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.148    PF11_0416    PF13_0233    PFE0175c    PFF0675c    PFL1435c   


    G3DSA:3.30.559.10 - G3DSA:3.30.559.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0407    PF13_0121    PFC0170c   


    G3DSA:3.30.56.20 - B5 (Gene3D link)

    Interpro entry IPR005147 : tRNA synthetase, B5 (Interpro link)

    Interpro description:

    Domain B5 is found in phenylalanine-tRNA synthetase beta subunits. This domain has been shown to bind DNA through a winged helix-turn-helix motif. Phenylalanine-tRNA synthetase may influence common cellular processes via DNA binding, in addition to its aminoacylation function.

    Proteins where this domain is known:
    PF11_0051   


    G3DSA:3.30.56.30 - SRP19 (Gene3D link)

    Interpro entry IPR002778 : Signal recognition particle, SRP19 subunit (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.

    Proteins where this domain is known:
    PFL0785c   


    G3DSA:3.30.565.10 - ATP_bd_ATPase (Gene3D link)

    Interpro entry IPR003594 : ATP-binding region, ATPase-like (Interpro link)

    Interpro description:

    This domain is found in several ATP-binding proteins, for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    MAL13P1.328    MAL7P1.145    PF07_0029    PF11_0184    PF11_0188    PF14_0316    PF14_0417    PFL1070c    PFL1915w   


    G3DSA:3.30.572.10 - Thymidylat_synth_C (Gene3D link)

    Interpro entry IPR000398 : Thymidylate synthase, C-terminal (Interpro link)

    Interpro description:
    Thymidylate synthase catalyzes the reductive methylation of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to dihydrofolate:
     5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP 
    This provides the sole de novo pathway for production of dTMP and is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species, except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.

    Proteins where this domain is known:
    PFD0830w   


    G3DSA:3.30.590.10 - ATP-gua_Ptrans (Gene3D link)

    Interpro entry IPR014746 : Glutamine synthetase/guanido kinase, catalytic region (Interpro link)

    Interpro description:

    The C-terminal catalytic domains of glutamine synthetase and the guanido kinase family (which includes creatine kinase and arginine kinase) share a common structural fold, namely a common core consisting of two beta-alpha-beta2-alpha repeats.

    Glutamine synthetase (GS) plays an essential role in the metabolism of nitrogen by catalysing the condensation of glutamate and ammonia to form glutamine. There seem to be three different classes of GS. Class I enzymes (GSI) are specific to prokaryotes, and are oligomers of 12 identical subunits; the activity of GSI-type enzymes is controlled by the adenylation of a tyrosine residue. Class II enzymes (GSII) are found in eukaryotes and in bacteria, and are oligomers of 8 identical subunits. Class III enzymes (GSIII) have been found in Bacteroides fragilis and in Butyrivibrio fibrisolvens, and are oligomers of six identical subunits. While the three classes of GS are clearly structurally related, the sequence similarities are not so extensive.

    ATP:guanido phosphotransferases are a family of structurally and functionally related enzymes that reversibly catalyse the transfer of phosphate between ATP and various phosphogens. The enzymes belonging to this family include:

    Proteins where this domain is known:
    PFI1110w   


    G3DSA:3.30.70.1010 - Transl_elong_EF1_G_con (Gene3D link)

    Interpro entry IPR001662 : Translation elongation factor EF1B, gamma chain, conserved (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    This entry represents a conserved domain usually found near the C-terminus of EF1B-gamma chains, a peptide of 410-440 residues. The gamma chain appears to play a role in anchoring the EF1B complex to the beta and delta chains and to other cellular components.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PF13_0214   


    G3DSA:3.30.70.1130 - G3DSA:3.30.70.1130 (Gene3D link)

    Proteins where this domain is known:
    PF07_0117   


    G3DSA:3.30.70.1170 - G3DSA:3.30.70.1170 (Gene3D link)

    Proteins where this domain is known:
    PFL1475w   


    G3DSA:3.30.70.1220 - Nuc_excision_repair_TFIIH_TTDA (Gene3D link)

    Interpro entry IPR009400 : Nucleotide excision repair, TFIIH, subunit TTDA (Interpro link)

    Interpro description:

    This entry represents nucleotide excision repair (NER) proteins, such as TTDA subunit of TFIIH basal transcription factor complex (also known as subunit 5 of RNA polymerase II transcription factor B), and Rex1. These proteins have a structural motif consisting of a 2-layer sandwich structure with an alpha/beta plait topology. Nucleotide excision repair is a major pathway for repairing UV light-induced DNA damage in most organisms.

    Transcription/repair factor IIH (TFIIH) is essential for RNA polymerase II transcription and nucleotide excision repair. The TFIIH complex consists of ten subunits: ERCC2, ERCC3, GTF2H1, GTF2H2, GTF2H3, GTF2H4, GTF2H5, MNAT1, CDK7 and CCNH. Defects in GTF2H5 cause the disease trichothiodystrophy (TTD), therefore GTF2H5 (general transcription factor 2H subunit 5) is also known as the TTD group A (TTDA) subunit (and as Tfb5). The TTDA subunit is responsible for the DNA repair function of the complex. TTDA is present both bound to TFIIH, and as a free fraction that shuttles between the cytoplasm and nucleus; induction of NER-type DNA lesions shifts the balance towards TTDA's more stable association with TFIIH. TTDA is also required for the stability of the TFIIH complex and for the presence of normal levels of TFIIH in the cell.

    REX1 (required for excision 1) is required for DNA repair in the single-celled, photosynthetic alga Chlamydomonas reinhardtii, and has homologues in other eukaryotes.

    Proteins where this domain is known:
    PF14_0398   


    G3DSA:3.30.70.1230 - A/G_cyclase (Gene3D link)

    Interpro entry IPR001054 : Adenylyl cyclase class-3/4/guanylyl cyclase (Interpro link)

    Interpro description:

    Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.

    In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.

    Proteins where this domain is known:
    MAL13P1.301    MAL8P1.150    PF11_0395    PF14_0043   


    G3DSA:3.30.70.141 - NDK (Gene3D link)

    Interpro entry IPR001564 : Nucleoside diphosphate kinase, core (Interpro link)

    Interpro description:

    Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.

    In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
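    The random association of A and B chains into hexamers described above can be illustrated with a short Python sketch. It enumerates the seven possible compositions (A6 through B6) and, as a simplifying assumption not stated in the source, estimates their expected fractions by treating each of the six positions as an independent draw from the chain pool. The function names and the `frac_a` parameter are hypothetical.

```python
from math import comb


def hexamer_compositions(n_subunits=6):
    """Return composition labels for an oligomer built from A and B chains."""
    labels = []
    for n_b in range(n_subunits + 1):
        n_a = n_subunits - n_b
        # Omit the count when it is 1 (e.g. "A5B"), and the chain when it is 0.
        part_a = "" if n_a == 0 else ("A" if n_a == 1 else f"A{n_a}")
        part_b = "" if n_b == 0 else ("B" if n_b == 1 else f"B{n_b}")
        labels.append(part_a + part_b)
    return labels


def composition_fractions(frac_a, n_subunits=6):
    """Expected fraction of each composition.

    Assumption (illustrative only): every position is filled independently,
    with probability frac_a of being an A chain, i.e. a binomial model.
    """
    frac_b = 1.0 - frac_a
    return [
        comb(n_subunits, n_b) * frac_a ** (n_subunits - n_b) * frac_b ** n_b
        for n_b in range(n_subunits + 1)
    ]


if __name__ == "__main__":
    for label, fraction in zip(hexamer_compositions(), composition_fractions(0.5)):
        print(f"{label}: {fraction:.4f}")
```

    Under this model, equal amounts of A and B chains would make the mixed A3B3 species the most common one, while the pure A6 and B6 hexamers would be the rarest.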

    NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.

    NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.

    Proteins where this domain is known:
    PF13_0349    PFF0275c   


    G3DSA:3.30.70.240 - Transl_elong_EFG/EF2_C (Gene3D link)

    Interpro entry IPR000640 : Translation elongation factor EFG/EF2, C-terminal (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF2 (EF-G) is a G-protein. It brings about the translocation of peptidyl-tRNA and mRNA through a ratchet-like mechanism: the binding of GTP-EF2 to the ribosome causes a counter-clockwise rotation in the small ribosomal subunit; the hydrolysis of GTP to GDP by EF2 and the subsequent release of EF2 causes a clockwise rotation of the small subunit back to the starting position. This twisting action destabilises tRNA-ribosome interactions, freeing the tRNA to translocate along the ribosome upon GTP-hydrolysis by EF2. EF2 binding also affects the entry and exit channel openings for the mRNA, widening it when bound to enable the mRNA to translocate along the ribosome.

    This entry represents the C-terminal domain found in EF2 (or EF-G) of both prokaryotes and eukaryotes (also known as eEF2), as well as in some tetracycline-resistance proteins. This domain adopts a ferredoxin-like fold consisting of an alpha/beta sandwich with anti-parallel beta-sheets. It resembles the topology of domain III found in these elongation factors, with which it forms the C-terminal block, but these two domains cannot be superimposed. This domain is often found associated with, which contains the signatures for the N-terminus of the proteins.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    MAL13P1.243    PF07_0062    PF10_0041    PFF0115c    PFI0570w    PFL1590c    PFL1710c   


    G3DSA:3.30.70.270 - G3DSA:3.30.70.270 (Gene3D link)

    Proteins where this domain is known:
    PFI0510c   


    G3DSA:3.30.70.330 - a_b_plait_nuc_bd (Gene3D link)

    Interpro entry IPR012677 : Nucleotide-binding, alpha-beta plait (Interpro link)

    Interpro description:

    This entry represents nucleotide-binding domains with an alpha-beta plait structure, which consists of either a ferredoxin-like (beta-alpha-beta)2 fold, such as that found in RNA-binding domains of various ribonucleoproteins or in viral DNA-binding domains; or a beta-(alpha)-beta-alpha-beta(2) fold, such as that found in the ribosomal protein L23.

    Proteins where this domain is known:
    MAL13P1.120    MAL13P1.303    MAL13P1.338    MAL13P1.35    MAL7P1.126    MAL7P1.157a    MAL8P1.40    MAL8P1.83    PF07_0066    PF08_0086    PF10_0028    PF10_0047    PF10_0068    PF10_0194    PF10_0214    PF10_0217    PF10_0235    PF10_0279    PF11_0083    PF11_0111    PF11_0200    PF11_0205    PF11_0279    PF11_0330    PF11_0402    PF13_0122    PF13_0132    PF13_0147    PF13_0158    PF13_0165    PF13_0315    PF13_0318    PF14_0028    PF14_0056    PF14_0057    PF14_0096    PF14_0194    PF14_0433    PF14_0513    PF14_0656    PFB0255w    PFC0865w    PFD0700c    PFD0750w    PFD0775c    PFE0160c    PFE0750c    PFE0865c    PFE0885w    PFE0975c    PFF0300w    PFF0320c    PFF0505c    PFF0760w    PFF1125c    PFF1425w    PFI0820c    PFI1025w    PFI1175c    PFI1435w    PFI1600w    PFI1695c    PFL0375w    PFL0830w    PFL1170w    PFL1200c    PFL1705w    PFL1745c    PFL1895w    PFL2130w    PFL2310w   


    G3DSA:3.30.70.370 - G3DSA:3.30.70.370 (Gene3D link)

    Proteins where this domain is known:
    PF11_0264   


    G3DSA:3.30.70.380 - Fdx_AntiC_bd (Gene3D link)

    Interpro entry IPR005121 : Phenylalanyl-tRNA synthetase, beta subunit, ferrodoxin-fold anticodon-binding (Interpro link)

    Interpro description:

    This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).

    Proteins where this domain is known:
    PFF0180w   


    G3DSA:3.30.70.580 - PseudoU_synth_1 (Gene3D link)

    Interpro entry IPR001406 : tRNA pseudouridine synthase (Interpro link)

    Interpro description:
    Transfer RNA-pseudouridine synthetase contains one atom of zinc essential for its native conformation and tRNA recognition and has a strictly conserved aspartic acid that is likely to be involved in catalysis. It is involved in the formation of pseudouridine at positions 38, 39 and 40 in the anticodon stem and loop of transfer-RNAs. Pseudouridine is the most abundant modified nucleoside found in all cellular RNAs.

    Proteins where this domain is known:
    PF08_0123    PFE0815w    PFI0420c   


    G3DSA:3.30.70.590 - PolA_pol_RNA-bd (Gene3D link)

    Interpro entry IPR007010 : Poly(A) polymerase, RNA-binding region (Interpro link)

    Interpro description:

    In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. The crystal structure of bovine poly(A) polymerase bound to an ATP analogue at 2.5 Å resolution has been determined. The structure revealed expected and unexpected similarities to other proteins. As expected, the catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase.

    The C-terminal domain unexpectedly folds into a compact domain reminiscent of the RNA-recognition motif fold. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.

    Proteins where this domain is known:
    PFF1240w   


    G3DSA:3.30.70.60 - Transl_elong_EF1B/rib_con (Gene3D link)

    Interpro entry IPR014717 : (Interpro link)

    Interpro description:

    An alpha+beta sandwich domain with a Ferredoxin-like fold can be found in the beta chain of the translation elongation factor EF1B, and in the ribosomal protein S6 from the small subunit.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    Proteins where this domain is known:
    PF14_0606    PFC0870w    PFI0645w    PFI1585c   


    G3DSA:3.30.70.600 - Ribosomal_S10 (Gene3D link)

    Interpro entry IPR001848 : Ribosomal protein S10 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacteria S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.

    Proteins where this domain is known:
    PF10_0038    PF14_0581   


    G3DSA:3.30.70.660 - PseudoU_synth_1 (Gene3D link)

    Interpro entry IPR001406 : tRNA pseudouridine synthase (Interpro link)

    Interpro description:
    Transfer RNA-pseudouridine synthetase contains one atom of zinc essential for its native conformation and tRNA recognition and has a strictly conserved aspartic acid that is likely to be involved in catalysis. It is involved in the formation of pseudouridine at positions 38, 39 and 40 in the anticodon stem and loop of transfer-RNAs. Pseudouridine is the most abundant modified nucleoside found in all cellular RNAs.

    Proteins where this domain is known:
    PF08_0123    PFE0815w    PFI0420c   


    G3DSA:3.30.70.830 - Ion_tolerance_CutA1 (Gene3D link)

    Interpro entry IPR004323 : Divalent ion tolerance protein, CutA1 (Interpro link)

    Interpro description:

    CutA1 is a widespread protein of about 12 kDa found in bacteria, plants, and animals, including humans. The protein was originally identified in a gene locus of Escherichia coli called cutA involved in divalent metal tolerance. The cutA locus consists of two operons, one containing a single gene encoding a cytoplasmic protein, CutA1, and the other composed of two genes encoding 50-kDa (CutA2) and 24-kDa (CutA3) inner membrane proteins. Molecular genetics studies on the E. coli cutA locus showed that some mutations lead to copper sensitivity due to its increased uptake. However, the specific function of CutA1 in E. coli is still unknown.

    However, a possible role has been suggested for mammalian CutA1 in the anchoring of the enzyme acetylcholinesterase (AChE) in neuronal cell membranes. CutA1 does not directly interact with AChE, but the CutA1 gene is widely expressed in different regions of the brain with an expression pattern that parallels that of AChE. In addition, CutA1 co-purified with AChE from human caudate nucleus. CutA1, thus, might provide an intriguing link between copper tolerance in bacteria and a complex process in the brain of the most evolved organisms.

    Both rat and E. coli CutA1 have been crystallised. Both proteins are trimeric in the crystals and in solution through an inter-subunit beta-sheet formation. Each monomer exhibits the same overall structure, adopting a ferredoxin-like fold made of an alpha-beta sandwich with antiparallel beta-sheet and containing an additional short strand and a C-terminal helix. In the beta-sheet, alternate strands are connected by helices with positive crossovers, resulting in a double beta-alpha-beta motif where the antiparallel beta-sheet packs against antiparallel alpha-helices. The C-terminal helix packs orthogonal to the N terminus.

    The strong structural similarity of CutA1 with PII proteins might point to a role for CutA1 in signalling through allosteric communication between monomers. CutA1 may be involved in the tuning of a disulphide bond cascade in bacteria and mammals, acting as the PII proteins do in the nitrogen signal cascade in bacteria and plants.

    Proteins where this domain is known:
    PFL2375c   


    G3DSA:3.30.70.870 - G3DSA:3.30.70.870 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.243    PF14_0486    PFF0115c    PFL1590c    PFL1710c   


    G3DSA:3.30.700.20 - G3DSA:3.30.700.20 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.172   


    G3DSA:3.30.710.10 - BTB/POZ_fold (Gene3D link)

    Interpro entry IPR011333 : BTB/POZ fold (Interpro link)

    Interpro description:

    The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is a versatile protein-protein interaction motif involved in many cellular functions, including transcriptional regulation, cytoskeleton dynamics, ion channel assembly and gating, and targeting proteins for ubiquitination. The BTB domain can occur alongside other domains: BTB-zinc finger (BTB-ZF), BTB-BACK-Kelch (BBK), voltage-gated potassium channel T1 (T1-Kv), MATH-BTB, BTB-NPH3 and BTB-BACK-PHR (BBP). Other proteins, such as Skp1 and ElonginC, consist almost exclusively of the core BTB fold. In all of these protein families, the BTB core fold is structurally conserved, consisting of a 2-layer alpha/beta topology where a cluster of alpha helices is flanked by short beta-sheets. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.

    This entry differs from IPR000210 in including POZ-containing Skp1 proteins.

    Proteins where this domain is known:
    MAL13P1.337    PFL1875w   


    G3DSA:3.30.720.10 - Signal_recog_particle_SRP9/14 (Gene3D link)

    Interpro entry IPR009018 : Signal recognition particle, SRP9/SRP14 subunit (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents both the 9 kDa SRP9 and the 14 kDa SRP14 components. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.

    Proteins where this domain is known:
    MAL7P1.158    PFL0160w   


    G3DSA:3.30.740.10 - Dynein_light1 (Gene3D link)

    Interpro entry IPR001372 : Dynein light chain, type 1 and 2 (Interpro link)

    Interpro description:

    Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.

    Dynein is composed of a number of ATP-binding large subunits, intermediate size subunits and small subunits. Among the small subunits, there is a group of highly conserved proteins which make up this family.

    Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).

    Proteins where this domain is known:
    MAL7P1.161    PF13_0306    PFL0660w   


    G3DSA:3.30.760.10 - TIF_eIF_4E (Gene3D link)

    Interpro entry IPR001040 : Eukaryotic translation initiation factor 4E (eIF-4E) (Interpro link)

    Interpro description:
    Eukaryotic translation initiation factor 4E (eIF-4E) is a protein that binds to the cap structure of eukaryotic cellular mRNAs. eIF-4E recognises and binds the 7-methylguanosine-containing (m7Gppp) cap during an early step in the initiation of protein synthesis and facilitates ribosome binding to an mRNA by inducing the unwinding of its secondary structures. A tryptophan in the central part of the sequence of human eIF-4E seems to be implicated in cap-binding.

    Proteins where this domain is known:
    PFA0570w    PFC0635c   


    G3DSA:3.30.780.10 - G3DSA:3.30.780.10 (Gene3D link)

    Proteins where this domain is known:
    PF08_0079    PFL2095w   


    G3DSA:3.30.800.10 - G3DSA:3.30.800.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0123    PFA0515w   


    G3DSA:3.30.810.10 - G3DSA:3.30.810.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0123    PFA0515w   


    G3DSA:3.30.830.10 - Pept_M16_core (Gene3D link)

    Interpro entry IPR011237 : Peptidase M16, core (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
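    The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be sketched as two regular expressions in Python. This is only an illustration under stated assumptions: 'a' is restricted to valine or threonine (the source says "most often"), "uncharged" is taken to mean any residue other than D, E, K or R, and "hydrophobic" is taken to mean one of A, V, L, I, M, F, W or Y. The function names and the toy sequences in the usage note are hypothetical.

```python
import re

# Loose motif: His, Glu, any two residues, His.
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE..H))")

# Stringent motif 'abXHEbbHbc' under the assumptions listed above.
A = "[VT]"          # assumption: only the most frequent residues, Val or Thr
B = "[^DEKR]"       # assumption: "uncharged" = anything except D, E, K, R
C = "[AVLIMFWY]"    # assumption: a simple hydrophobic residue set
STRINGENT = re.compile(rf"(?=({A}{B}.HE{B}{B}H{B}{C}))")


def find_hexxh(sequence):
    """Return 0-based start positions of every HEXXH occurrence."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


def find_stringent(sequence):
    """Return 0-based start positions of every 'abXHEbbHbc' occurrence."""
    return [m.start() for m in STRINGENT.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MKVVAHELTHAVDG"  # hypothetical toy sequence, not a database entry
    print(find_hexxh(toy), find_stringent(toy))
```

    On the toy sequence, the loose pattern starts at the histidine while the stringent pattern starts three residues earlier, at the 'a' position. A sequence such as "GGVKAHELTHAV" still contains HEXXH but fails the stringent form here, because the 'b' position holds a charged lysine.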

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    The majority of the sequences in this entry are metallopeptidases and non-peptidase homologues belonging to MEROPS peptidase family M16 (clan ME), subfamilies M16A, M16B and M16C; they include:

    These proteins do not share many regions of sequence similarity; the most noticeable is in the N-terminal section. This region includes a conserved histidine followed, two residues later, by a glutamate and another histidine. In pitrilysin, it has been shown that this H-x-x-E-H motif is involved in enzymatic activity; the two histidines bind zinc and the glutamate is necessary for catalytic activity. The proteins classified as non-peptidase homologues either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity.
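
    As a rough illustration of the H-x-x-E-H motif described above, the following Python sketch reports each hit together with the positions of the two histidines and the glutamate. The function name and the example sequence are hypothetical, and a match on this pattern alone says nothing about whether a protein is catalytically active.

```python
import re

# H-x-x-E-H: histidine, any two residues, glutamate, histidine.
# The lookahead allows overlapping hits to be found.
HXXEH = re.compile(r"(?=(H..EH))")


def find_hxxeh(sequence):
    """List each H-x-x-E-H hit with 1-based positions of its key residues."""
    hits = []
    for m in HXXEH.finditer(sequence.upper()):
        start = m.start() + 1
        hits.append({
            "motif": m.group(1),
            "histidines": (start, start + 4),  # zinc ligands in pitrilysin
            "glutamate": start + 3,            # needed for catalytic activity
        })
    return hits


if __name__ == "__main__":
    toy = "MSLGHFLEHMLFKG"  # hypothetical sequence, not a database entry
    for hit in find_hxxeh(toy):
        print(hit)
```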

    Proteins where this domain is known:
    PF11_0189    PF11_0226    PF13_0322    PF14_0382    PFE1155c    PFI1625c   


    G3DSA:3.30.860.10 - Ribosomal_S19 (Gene3D link)

    Interpro entry IPR002222 : Ribosomal protein S19/S15 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).

    Proteins where this domain is known:
    MAL13P1.92   


    G3DSA:3.30.870.10 - G3DSA:3.30.870.10 (Gene3D link)

    Proteins where this domain is known:
    PFF0465c   


    G3DSA:3.30.900.10 - G3DSA:3.30.900.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0227   


    G3DSA:3.30.930.10 - G3DSA:3.30.930.10 (Gene3D link)

    Proteins where this domain is known:
    PF07_0073    PF10_0409    PF11_0051    PF11_0270    PF13_0262    PF14_0166    PF14_0198    PF14_0428    PF14_0573    PFA0145c    PFA0480w    PFB0525w    PFE0475w    PFE0715w    PFF0180w    PFI1240c    PFI1645c    PFL0670c    PFL0770w    PFL1540c   


    G3DSA:3.30.950.10 - 4pyrrole_Mease_sub2 (Gene3D link)

    Interpro entry IPR014776 : Tetrapyrrole methylase, subdomain 2 (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    This entry represents the C-terminal subdomain 2 from several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:

    Proteins where this domain is known:
    PF10_0087   


    G3DSA:3.30.960.10 - G3DSA:3.30.960.10 (Gene3D link)

    Proteins where this domain is known:
    PFB0550w   


    G3DSA:3.30.980.10 - G3DSA:3.30.980.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0270   


    G3DSA:3.40.1000.10 - Mog1/PsbP_a/b/a-sand (Gene3D link)

    Interpro entry IPR016123 : (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of a 3-layer alpha/beta/alpha fold. The beta layer is composed of seven beta-sheets, and the overall order is: (beta-hairpin)-beta(3)-alpha-beta(4)-alpha. Domains with this structure are found in the following protein families:

    Proteins where this domain is known:
    MAL13P1.232   


    G3DSA:3.40.1010.10 - 4pyrrole_Mease_sub1 (Gene3D link)

    Interpro entry IPR014777 : Tetrapyrrole methylase, subdomain 1 (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    This entry represents the N-terminal subdomain 1 from several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:

    Proteins where this domain is known:
    PF10_0087   


    G3DSA:3.40.1060.10 - Aconitase/IPMdHydase_lsu_aba_2 (Gene3D link)

    Interpro entry IPR015932 : Aconitase/3-isopropylmalate dehydratase large subunit, alpha/beta/alpha, subdomain 2 (Interpro link)

    Interpro description:

    3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydrogenases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.

    This entry represents a domain with an alpha/beta/alpha topology. This structural domain usually occurs in triplicate, with domains 1 and 3 being the most closely related since they share the same pseudo 2-fold symmetry. This entry represents domain 2. This triple domain region is found at the N-terminal of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the C-terminal of bacterial AcnB; in each case, this region binds the [4Fe-4S]-cluster. This triple domain region is also found in the large subunit of isopropylmalate dehydratase (LeuC).

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    G3DSA:3.40.1090.10 - G3DSA:3.40.1090.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.285    PFB0410c   


    G3DSA:3.40.1110.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PFA0310c    PFL0590c   


    G3DSA:3.40.1120.10 - Ribosomal_L15e (Gene3D link)

    Interpro entry IPR000439 : Ribosomal protein L15e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:

  • Mammalian L15.
  • Insect L15.
  • Plant L15.
  • Yeast YL10 (L13) (Rp15r).
  • Archaebacterial L15e.

    These proteins have about 200 amino acid residues.

    Proteins where this domain is known:
    PFD0770c   


    G3DSA:3.40.1130.10 - G3DSA:3.40.1130.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0100   


    G3DSA:3.40.1170.10 - DNA_mismatch_repair_MutS_N (Gene3D link)

    Interpro entry IPR016151 : DNA mismatch repair protein MutS, N-terminal (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex structure, and is composed of:

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed beta-sheet surrounded by three alpha-helices, which is similar to the structure of tRNA endonuclease.

    Proteins where this domain is known:
    PF14_0051    PFE0270c   


    G3DSA:3.40.1180.10 - UPP_synth (Gene3D link)

    Interpro entry IPR001441 : Di-trans-poly-cis-decaprenylcistransferase-like (Interpro link)

    Interpro description:

    Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase

    Di-trans-poly-cis-decaprenylcistransferase (UPP synthetase) generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP). This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts.

    This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase.

    Proteins where this domain is known:
    MAL8P1.22   


    G3DSA:3.40.1190.10 - Mur_ligase_cen (Gene3D link)

    Interpro entry IPR013221 : Mur ligase, central (Interpro link)

    Interpro description:

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:

    Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.

    This entry represents the central domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes folylpolyglutamate synthase, which transfers glutamate to folylpolyglutamate, and cyanophycin synthetase, which catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).

    Proteins where this domain is known:
    PF13_0140   


    G3DSA:3.40.1190.20 - G3DSA:3.40.1190.20 (Gene3D link)

    Proteins where this domain is known:
    PF11_0453    PFE1030c    PFF0775w    PFL1920c   


    G3DSA:3.40.120.10 - A-D-PHexomutase_a/b/a-I/II/III (Gene3D link)

    Interpro entry IPR016055 : Alpha-D-phosphohexomutase, alpha/beta/alpha I, II and III (Interpro link)

    Interpro description:

    The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM are primarily bacterial enzymes that use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.

    Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.

    The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.

    This entry represents domains I, II and III found in alpha-D-phosphohexomutase enzymes. All three domains share a 3-layer alpha/beta/alpha topology.

    Proteins where this domain is known:
    PF10_0122    PF11_0311   


    G3DSA:3.40.1280.10 - G3DSA:3.40.1280.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0300    PF14_0273    PF14_0307    PFB0855c    PFE1275c    PFF1085c   


    G3DSA:3.40.1340.10 - RNA_pol_Rpb5_N (Gene3D link)

    Interpro entry IPR005571 : RNA polymerase, Rpb5, N-terminal (Interpro link)

    Interpro description:

    Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).

    This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of 3 layers alpha/beta/alpha. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA.

    Proteins where this domain is known:
    PF13_0341   


    G3DSA:3.40.1350.10 - Endonuc_TnsA/Hjc/tRNA (Gene3D link)

    Interpro entry IPR011856 : Endonuclease TnsA, N-terminal/resolvase Hjc/tRNA endonuclease, C-terminal (Interpro link)

    Interpro description:

    This entry represents a structural motif found in three types of endonucleases: TnsA endonuclease (N-terminal), Hjc-type resolvase, and tRNA-intron endonuclease (C-terminal). These domains have a 3-layer alpha/beta/alpha topology, which is similar in structure to a motif found in several restriction endonucleases.

    TnsA endonuclease is a catalytic component of the Tn7 transposition system. Tn7 transposase is composed of four proteins: TnsA, TnsB, TnsC and TnsD. DNA breakage at the 5' end of the transposon is carried out by TnsA, and breakage and joining at the 3' end is carried out by TnsB. TnsC is the molecular switch that regulates transposition. The N-terminal domain of TnsA is catalytic.

    Hjc is a type of Holliday junction resolvase. The Holliday junction is an essential intermediate of homologous recombination, comprising four-stranded DNA complexes that are formed during recombination and related DNA repair events. During homologous recombination, genetic information is physically exchanged between parental DNAs via crossing single strands of the same polarity within the four-way Holliday structure. Hjc is an archaeal endonuclease, which specifically resolves the junction DNA to produce two separate recombinant DNA duplexes. This process is terminated by the endonucleolytic activity of resolvases, which convert the four-way DNA back to two double strands.

    tRNA-intron endonucleases cleave pre-tRNA producing 5'-hydroxyl and 2',3'-cyclic phosphate termini, and specifically removing the intron. The splicing of transfer RNA precursors is similar in Eucarya and Archaea. In both kingdoms an endonuclease recognises the splice sites and releases the intron, but the mechanism of splice site recognition is different in each kingdom.

    Proteins where this domain is known:
    PF14_0514    PFL2300w   


    G3DSA:3.40.1360.10 - G3DSA:3.40.1360.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0412    PFL0825c   


    G3DSA:3.40.1370.10 - G3DSA:3.40.1370.10 (Gene3D link)

    Proteins where this domain is known:
    PF08_0038    PFE0350c   


    G3DSA:3.40.1380.10 - G3DSA:3.40.1380.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0061   


    G3DSA:3.40.1380.20 - Pyrv_Knase_a/b (Gene3D link)

    Interpro entry IPR015794 : Pyruvate kinase, alpha/beta (Interpro link)

    Interpro description:

    Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:

     ADP + phosphoenolpyruvate = ATP + pyruvate 

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, except in certain bacteria, such as Escherichia coli, that have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor), therefore when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.

    The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    This entry represents the 3-layer alpha/beta/alpha sandwich domain.

    Proteins where this domain is known:
    PF10_0363    PFF1300w   


    G3DSA:3.40.140.10 - G3DSA:3.40.140.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0259    PFL0230w   


    G3DSA:3.40.1490.10 - G3DSA:3.40.1490.10 (Gene3D link)

    Proteins where this domain is known:
    PFD0355c    PFF0515c   


    G3DSA:3.40.1500.10 - Coprogen_oxidas (Gene3D link)

    Interpro entry IPR001260 : Coproporphyrinogen III oxidase (Interpro link)

    Interpro description:
    Coprogen oxidase (i.e. coproporphyrin III oxidase or coproporphyrinogenase) catalyses the oxidative decarboxylation of coproporphyrinogen III to protoporphyrinogen IX in the haem and chlorophyll biosynthetic pathways. The protein is a homodimer containing two internally bound iron atoms per molecule of native protein. The enzyme is active in the presence of molecular oxygen, which acts as an electron acceptor. The enzyme is widely distributed, having been found in a variety of eukaryotic and prokaryotic sources.

    Proteins where this domain is known:
    PF11_0436   


    G3DSA:3.40.190.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PFL0480w   


    G3DSA:3.40.192.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF14_0164    PF14_0286   


    G3DSA:3.40.20.10 - G3DSA:3.40.20.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0324    PF13_0326    PFD0250c    PFE0165w   


    G3DSA:3.40.220.10 - G3DSA:3.40.220.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.74    MAL7P1.83    PF14_0466   


    G3DSA:3.40.250.10 - Rhodanese-like (Gene3D link)

    Interpro entry IPR001763 : (Interpro link)

    Interpro description:

    Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.

    Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.

    Proteins where this domain is known:
    PF13_0027    PFL0320w   


    G3DSA:3.40.30.10 - Thioredoxin_fold (Gene3D link)

    Interpro entry IPR012335 : (Interpro link)

    Interpro description:

    Several biological processes regulate the activity of target proteins through changes in the redox state of thiol groups (S2 to SH2), where a hydrogen donor is linked to an intermediary disulphide protein. Such processes include the ferredoxin/thioredoxin system, the NADP/thioredoxin system, and the glutathione/glutaredoxin system. Several of these disulphide proteins share a common structure, consisting of a three-layer alpha/beta/alpha core. Proteins that contain domains with a thioredoxin fold include:

    Proteins where this domain is known:
    MAL13P1.100    MAL13P1.225    MAL7P1.159    MAL7P1.88    MAL8P1.17    PF07_0034    PF07_0036    PF08_0032    PF08_0131    PF10_0066    PF10_0134    PF10_0268    PF11_0055    PF11_0099    PF11_0286    PF11_0352    PF13_0214    PF13_0272    PF14_0186    PF14_0187    PF14_0368    PF14_0545    PF14_0590    PF14_0694    PFC0166w    PFC0205c    PFC0271c    PFE0820c    PFF0340c    PFI0790w    PFI0945w    PFI0950w    PFI1250w    PFL0595c    PFL0725w    PFL1520w   


    G3DSA:3.40.366.10 - Ac_transferase_reg (Gene3D link)

    Interpro entry IPR001227 : Acyl transferase region (Interpro link)

    Interpro description:
    Enzymes like bacterial malonyl CoA-acyl carrier protein transacylase and eukaryotic fatty acid synthase that are involved in fatty acid biosynthesis belong to this group. Also included are the polyketide synthases 6-methylsalicylic acid synthase, a multifunctional enzyme that is involved in the biosynthesis of patulin, and conidial green pigment synthase.

    Proteins where this domain is known:
    PF13_0066   


    G3DSA:3.40.367.20 - G3DSA:3.40.367.20 (Gene3D link)

    Proteins where this domain is known:
    PFF1155w   


    G3DSA:3.40.390.10 - G3DSA:3.40.390.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.184    PF10_0058   


    G3DSA:3.40.390.30 - G3DSA:3.40.390.30 (Gene3D link)

    Proteins where this domain is known:
    PFD0980w   


    G3DSA:3.40.430.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PFD0830w   


    G3DSA:3.40.440.10 - G3DSA:3.40.440.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0287   


    G3DSA:3.40.449.10 - PEP_carboxykinase_N (Gene3D link)

    Interpro entry IPR008210 : Phosphoenolpyruvate carboxykinase, N-terminal (Interpro link)

    Interpro description:

    Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.

    PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.

    PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding, this motif differing from other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.

    This entry represents the N-terminal domain found in both GTP-utilising and ATP-utilising phosphoenolpyruvate carboxykinase enzymes.

    Proteins where this domain is known:
    PF13_0234   


    G3DSA:3.40.47.10 - Thiolase-like_subgr (Gene3D link)

    Interpro entry IPR016038 : Thiolase-like, subgroup (Interpro link)

    Interpro description:

    This entry represents a subgroup of thiolase-like domains (missing a few subfamilies). These domains have a 3-layer structure with an alpha/beta/alpha topology. This domain usually occurs in two similar copies that are related by a pseudo-dyad, and which arose through duplication. The proteins in this entry can be split into two groups: those related to thiolase, and those related to chalcone synthase.

    Proteins where this domain is known:
    PF14_0484    PFB0505c    PFF1275c   


    G3DSA:3.40.470.10 - Uracil-DNA_glycosylase-like (Gene3D link)

    Interpro entry IPR005122 : (Interpro link)

    Interpro description:

    This entry represents various uracil-DNA glycosylases and related DNA glycosylases, such as uracil-DNA glycosylase, thermophilic uracil-DNA glycosylase, G:T/U mismatch-specific DNA glycosylase (Mug), and single-strand selective monofunctional uracil-DNA glycosylase (SMUG1). These proteins have a 3-layer alpha/beta/alpha structure. Uracil-DNA glycosylases are DNA repair enzymes that excise uracil residues from DNA by cleaving the N-glycosylic bond, initiating the base excision repair pathway. Uracil in DNA can arise either through the deamination of cytosine to form mutagenic U:G mispairs, or through the incorporation of dUMP by DNA polymerase to form U:A pairs. These aberrant uracil residues are genotoxic. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures, to act as a general base in the catalytic mechanism.

    Proteins where this domain is known:
    PF14_0148   


    G3DSA:3.40.5.10 - Ribosomal_L9 (Gene3D link)

    Interpro entry IPR000244 : Ribosomal protein L9 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.

    Proteins where this domain is known:
    MAL13P1.318   


    G3DSA:3.40.50.1000 - G3DSA:3.40.50.1000 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.275    MAL13P1.301    PF07_0059    PF07_0110    PF07_0115    PF10_0124    PF10_0325    PF11_0190    PF11_0395    PF14_0654    PFC0840w    PFE0195w    PFE0795c    PFE0805w    PFI0240c    PFL0950c    PFL1125w    PFL1260w    PFL1270w   


    G3DSA:3.40.50.10050 - G3DSA:3.40.50.10050 (Gene3D link)

    Proteins where this domain is known:
    PFF0345w   


    G3DSA:3.40.50.1010 - G3DSA:3.40.50.1010 (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.67    PF07_0105    PF10_0040    PF10_0080    PFB0180w    PFB0265c    PFD0420c   


    G3DSA:3.40.50.10130 - DNA_repair_nuc_XPF/helicase (Gene3D link)

    Interpro entry IPR006166 : DNA repair nuclease, XPF-type/Helicase (Interpro link)

    Interpro description:

    This entry represents a structural motif found in several DNA repair nucleases, such as Rad1/Mus81/XPF endonucleases, and in ATP-dependent helicases. The XPF/Rad1/Mus81-dependent nuclease family specifically cleaves branched structures generated during DNA repair, replication, and recombination, and is essential for maintaining genome stability. The nuclease domain architecture exhibits remarkable similarity to those of restriction endonucleases.

    Proteins where this domain is known:
    MAL13P1.346   


    G3DSA:3.40.50.10190 - G3DSA:3.40.50.10190 (Gene3D link)

    Proteins where this domain is known:
    PFB0895c   


    G3DSA:3.40.50.10240 - TPK_catalytic (Gene3D link)

    Interpro entry IPR007371 : Thiamin pyrophosphokinase, catalytic region (Interpro link)

    Interpro description:
    Thiamin pyrophosphokinase (TPK) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamin) to form the coenzyme thiamin pyrophosphate (TPP). Thus, TPK is important for the formation of a coenzyme required for central metabolic functions. The structure of thiamin pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis.

    Proteins where this domain is known:
    PFI1195c   


    G3DSA:3.40.50.10260 - G3DSA:3.40.50.10260 (Gene3D link)

    Proteins where this domain is known:
    PF14_0570   


    G3DSA:3.40.50.10300 - DNA/pantothenate-metab_flavo_C (Gene3D link)

    Interpro entry IPR007085 : (Interpro link)

    Interpro description:

    This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affect synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC. This domain adopts a 3-layer alpha/beta/alpha fold with mixed beta-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.

    Proteins where this domain is known:
    PF11_0036    PFD0610w   


    G3DSA:3.40.50.10320 - G3DSA:3.40.50.10320 (Gene3D link)

    Proteins where this domain is known:
    PFF1190c   


    G3DSA:3.40.50.10420 - FTHF_cligase (Gene3D link)

    Interpro entry IPR002698 : 5-formyltetrahydrofolate cyclo-ligase (Interpro link)

    Interpro description:
    5-formyltetrahydrofolate cyclo-ligase, or methenyl-THF synthetase, catalyses the conversion of 5-formyltetrahydrofolate (5-FTHF) to 5,10-methenyltetrahydrofolate; this requires ATP and Mg2+. 5-FTHF is used in chemotherapy, where it is clinically known as Leucovorin.

    Proteins where this domain is known:
    PFL2160c   


    G3DSA:3.40.50.10470 - G3DSA:3.40.50.10470 (Gene3D link)

    Proteins where this domain is known:
    PF08_0009    PF10_0136    PFL2430c   


    G3DSA:3.40.50.10480 - G3DSA:3.40.50.10480 (Gene3D link)

    Proteins where this domain is known:
    PF08_0055   


    G3DSA:3.40.50.10490 - G3DSA:3.40.50.10490 (Gene3D link)

    Proteins where this domain is known:
    PF10_0245    PF14_0341   


    G3DSA:3.40.50.1220 - G3DSA:3.40.50.1220 (Gene3D link)

    Proteins where this domain is known:
    PF13_0152    PF14_0489    PF14_0508   


    G3DSA:3.40.50.1240 - G3DSA:3.40.50.1240 (Gene3D link)

    Proteins where this domain is known:
    PF11_0208    PF14_0282    PFB0380c    PFD0660w   


    G3DSA:3.40.50.1260 - Phosphoglycerate_kinase_N (Gene3D link)

    Interpro entry IPR015824 : Phosphoglycerate kinase, N-terminal (Interpro link)

    Interpro description:

    Phosphoglycerate kinase (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.

    PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.

    Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.

    This entry represents the N-terminal domain of PGK.

    Proteins where this domain is known:
    PFI1105w   


    G3DSA:3.40.50.1270 - Phosphoglycerate_kinase_C (Gene3D link)

    Interpro entry IPR015901 : Phosphoglycerate kinase, C-terminal (Interpro link)

    Interpro description:

    Phosphoglycerate kinase (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.

    PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.

    Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.

    This entry represents the C-terminal domain of PGK.

    Proteins where this domain is known:
    PFI1105w   


    G3DSA:3.40.50.1360 - no description (Gene3D link)

    Proteins where this domain is known:
    PF14_0511    PFE0730c   


    G3DSA:3.40.50.1370 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.221   


    G3DSA:3.40.50.1380 - G3DSA:3.40.50.1380 (Gene3D link)

    Proteins where this domain is known:
    PF13_0044   


    G3DSA:3.40.50.140 - G3DSA:3.40.50.140 (Gene3D link)

    Proteins where this domain is known:
    PF13_0251   


    G3DSA:3.40.50.1400 - G3DSA:3.40.50.1400 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.326   


    G3DSA:3.40.50.1440 - Tubulin_FtsZ (Gene3D link)

    Interpro entry IPR003008 : Tubulin/FtsZ, GTPase (Interpro link)

    Interpro description:

    This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases, and this entry represents the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea.

    Proteins where this domain is known:
    PF08_0125    PF10_0084    PF14_0725    PFD1050w    PFI0180w    PFI1635w   


    G3DSA:3.40.50.1480 - Ad_hcy_hydrolase (Gene3D link)

    Interpro entry IPR000043 : S-adenosyl-L-homocysteine hydrolase (Interpro link)

    Interpro description:
    S-adenosyl-L-homocysteine hydrolase (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydration of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein of about 430 to 470 amino acids. The family contains a glycine-rich region in the central part of AdoHcyase, a region thought to be involved in NAD-binding.

    Proteins where this domain is known:
    PFE1050w   


    G3DSA:3.40.50.150 - G3DSA:3.40.50.150 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.214    MAL13P1.255    MAL13P1.31    MAL7P1.130    MAL7P1.151    PF07_0015    PF07_0020    PF08_0092    PF10_0197    PF10_0274    PF11_0116    PF11_0284    PF11_0301    PF11_0305    PF11_0348    PF13_0016    PF13_0052    PF13_0087    PF13_0236    PF13_0286    PF13_0323    PF14_0068    PF14_0156    PF14_0242    PF14_0309    PF14_0526    PFB0220w    PFD0350w    PFE1115c    PFI0415c    PFI0815c    PFI1235w    PFL1475w    PFL1775c    PFL2395c   


    G3DSA:3.40.50.1580 - no description (Gene3D link)

    Proteins where this domain is known:
    PFE0660c   


    G3DSA:3.40.50.170 - Formyl_transf_N (Gene3D link)

    Interpro entry IPR002376 : Formyl transferase, N-terminal (Interpro link)

    Interpro description:
    A number of formyl transferases belong to this group. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP. Formyltetrahydrofolate dehydrogenase produces formate from formyl-tetrahydrofolate. This is the N-terminal domain of these enzymes and is found upstream of the C-terminal domain.

    The trifunctional glycinamide ribonucleotide synthetase-aminoimidazole ribonucleotide synthetase-glycinamide ribonucleotide transformylase catalyses the second, third and fifth steps in de novo purine biosynthesis. The glycinamide ribonucleotide transformylase belongs to this group.

    Proteins where this domain is known:
    MAL13P1.67   


    G3DSA:3.40.50.1760 - Glutathione_synth_subst-bd_euk (Gene3D link)

    Interpro entry IPR004887 : Glutathione synthase, substrate-binding, eukaryotic (Interpro link)

    Interpro description:

    This entry represents the substrate-binding domain of glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system. The substrate-binding domain has a 3-layer alpha/beta/alpha structure.

    Proteins where this domain is known:
    PFE0605c   


    G3DSA:3.40.50.1820 - G3DSA:3.40.50.1820 (Gene3D link)

    Proteins where this domain is known:
    MAL7P1.156    MAL8P1.138    MAL8P1.38    PF07_0005    PF07_0040    PF08_0022    PF10_0379    PF11_0168    PF11_0211    PF11_0276    PF11_0441    PF14_0015    PF14_0017    PF14_0099    PF14_0250    PF14_0395    PF14_0737    PF14_0738    PFA0120c    PFC0065c    PFC0950c    PFD0185c    PFF1420w    PFI1775w    PFI1800w    PFL2530w   


    G3DSA:3.40.50.1910 - G3DSA:3.40.50.1910 (Gene3D link)

    Proteins where this domain is known:
    PF10_0331    PFF0665c   


    G3DSA:3.40.50.1950 - Flavoprotein (Gene3D link)

    Interpro entry IPR003382 : Flavoprotein (Interpro link)

    Interpro description:
    This entry contains a diverse range of flavoprotein enzymes, including epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN. This enzyme catalyzes the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a -C=C- double bond. This family also includes the B chain of dipicolinate synthase. Dipicolinate is a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance. Dipicolinate synthase catalyses the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenylacrylic acid decarboxylase.

    Proteins where this domain is known:
    MAL8P1.81   


    G3DSA:3.40.50.20 - Pre-ATP_grasp (Gene3D link)

    Interpro entry IPR013817 : Pre-ATP-grasp fold (Interpro link)

    Interpro description:

    The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyze the formation of amide bonds, catalyzing the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, and glutathione synthetase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP-hydrolysis to activate their substrates.

    This entry represents the pre-ATP-grasp domain, which precedes the ATP-grasp domain in all superfamily members, and which usually occurs at the N-terminus of the protein. The structure of the pre-ATP-grasp domain consists of alpha/beta/alpha in three layers, and is possibly a rudimentary form of the Rossmann-fold. This domain can have a substrate-binding function.

    Proteins where this domain is known:
    PF13_0044    PF14_0664   


    G3DSA:3.40.50.200 - Pept_S8_S53 (Gene3D link)

    Interpro entry IPR000209 : Peptidase S8 and S53, subtilisin, kexin, sedolisin (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
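
    The relationship between the linear order of the catalytic triad and the clan can be illustrated with a short Python sketch. This is only an illustration of the three orderings quoted above, not a MEROPS classification method; the function names are hypothetical, and the residue positions in the usage note are example inputs.

```python
# Linear order of the catalytic triad residues for the three clans
# named in the description (one-letter codes: H = His, D = Asp, S = Ser).
TRIAD_ORDER_BY_CLAN = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the triad residues sorted by their position in the sequence.

    `positions` maps a one-letter residue code to its sequence position.
    """
    return "".join(sorted(positions, key=positions.get))


def clans_matching(positions):
    """List the clans (of the three above) whose triad order matches."""
    order = triad_order(positions)
    return [clan for clan, expected in TRIAD_ORDER_BY_CLAN.items()
            if expected == order]
```

    For example, a triad at His-57, Asp-102 and Ser-195 (the conventional chymotrypsin numbering) gives the order HDS, and a triad at Asp-32, His-64 and Ser-221 (the conventional subtilisin numbering) matches only clan SB. A matching order is consistent with a clan but does not by itself establish membership.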

    In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belong to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin) both of which are members of clan SB.

    The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.

    The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only 1 viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish.

    Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.

    Proteins where this domain is known:
    PF11_0381    PFE0355c    PFE0370c   


    G3DSA:3.40.50.2020 - no description (Gene3D link)

    Proteins where this domain is known:
    PF10_0121    PF13_0143    PF13_0157    PFE0630c   


    G3DSA:3.40.50.2060 - G3DSA:3.40.50.2060 (Gene3D link)

    Proteins where this domain is known:
    PF10_0331    PFB0750w    PFF0665c   


    G3DSA:3.40.50.261 - Succinyl-CoA_synth-like (Gene3D link)

    Interpro entry IPR016102 : (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of 3 layers, alpha/beta/alpha. This domain is found in both the alpha and beta chains of succinyl-CoA synthase (GDP-forming and ADP-forming). This domain can also be found in ATP citrate synthase, malate-CoA ligase and acetate-CoA ligase (or acetyl-CoA synthase), as well as bacterial Fdr. Some members of the domain utilise ATP; others use GTP.

    Proteins where this domain is known:
    PF11_0097    PF14_0295    PF14_0357   


    G3DSA:3.40.50.300 - G3DSA:3.40.50.300 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.13    MAL13P1.164    MAL13P1.166    MAL13P1.205    MAL13P1.241    MAL13P1.243    MAL13P1.262    MAL13P1.294    MAL13P1.297    MAL13P1.344    MAL13P1.51    MAL13P1.96    MAL7P1.113    MAL7P1.12    MAL7P1.122    MAL7P1.201    MAL7P1.206    MAL7P1.209    MAL8P1.144    MAL8P1.19    MAL8P1.33    MAL8P1.53    MAL8P1.65    MAL8P1.75a    MAL8P1.76    MAL8P1.92    MAL8P1.99    PF07_0023    PF07_0047    PF07_0062    PF08_0018    PF08_0048    PF08_0062    PF08_0063    PF08_0078    PF08_0096    PF08_0100    PF08_0110    PF08_0111    PF08_0117    PF08_0126    PF10_0041    PF10_0057    PF10_0081    PF10_0086    PF10_0099    PF10_0203    PF10_0209    PF10_0232    PF10_0309    PF10_0337    PF10_0368    PF10_0369    PF11_0053    PF11_0071    PF11_0077    PF11_0078    PF11_0087    PF11_0117    PF11_0131    PF11_0143    PF11_0175    PF11_0183    PF11_0203    PF11_0225    PF11_0245    PF11_0249    PF11_0296    PF11_0314    PF11_0317    PF11_0405    PF11_0414    PF11_0461    PF11_0465    PF11_0466    PF13_0033    PF13_0037    PF13_0063    PF13_0065    PF13_0069    PF13_0077    PF13_0090    PF13_0095    PF13_0119    PF13_0177    PF13_0218    PF13_0271    PF13_0273    PF13_0291    PF13_0304    PF13_0305    PF13_0308    PF13_0330    PF13_0334    PF13_0350    PF14_0051    PF14_0052    PF14_0063    PF14_0100    PF14_0104    PF14_0112    PF14_0114    PF14_0126    PF14_0133    PF14_0147    PF14_0159    PF14_0177    PF14_0183    PF14_0185    PF14_0221    PF14_0234    PF14_0244    PF14_0254    PF14_0278    PF14_0292    PF14_0321    PF14_0326    PF14_0339    PF14_0345    PF14_0370    PF14_0399    PF14_0400    PF14_0415    PF14_0429    PF14_0436    PF14_0437    PF14_0455    PF14_0477    PF14_0485    PF14_0486    PF14_0548    PF14_0564    PF14_0593    PF14_0599    PF14_0601    PF14_0616    PF14_0655    PFA0180w    PFA0185w    PFA0330w    PFA0335w    PFA0495c    PFA0530c    PFA0545c    PFA0555c    PFA0590w    PFB0445c    PFB0500c    PFB0795w    PFB0840w    PFB0860c    PFB0895c    
PFC0125w    PFC0140c    PFC0190c    PFC0260w    PFC0565w    PFC0875w    PFC0915w    PFC0955w    PFD0245c    PFD0305c    PFD0385c    PFD0530c    PFD0565c    PFD0665c    PFD0685c    PFD0710w    PFD0725c    PFD0755c    PFD0790c    PFD0810w    PFD0935c    PFD1060w    PFD1070w    PFE0205w    PFE0215w    PFE0270c    PFE0430w    PFE0450w    PFE0625w    PFE0665c    PFE0690c    PFE0705c    PFE0830c    PFE0925c    PFE1085w    PFE1090w    PFE1150w    PFE1215c    PFE1255w    PFE1345c    PFE1390w    PFE1435c    PFF0100w    PFF0115c    PFF0155w    PFF0285c    PFF0345w    PFF0385c    PFF0625w    PFF0810c    PFF0940c    PFF1185w    PFF1500c    PFI0155c    PFI0165c    PFI0355c    PFI0480w    PFI0525w    PFI0570w    PFI0910w    PFI1005w    PFI1420w    PFI1505c    PFI1550c    PFL0075w    PFL0100c    PFL0150w    PFL0495c    PFL0560c    PFL0580w    PFL0835w    PFL0895c    PFL1310c    PFL1410c    PFL1500w    PFL1590c    PFL1710c    PFL1725w    PFL1925w    PFL2005w    PFL2010c    PFL2245w    PFL2345c    PFL2440w    PFL2465c    PFL2475w   


    G3DSA:3.40.50.360 - G3DSA:3.40.50.360 (Gene3D link)

    Proteins where this domain is known:
    PF14_0478    PFE1240w    PFI1140w   


    G3DSA:3.40.50.410 - G3DSA:3.40.50.410 (Gene3D link)

    Proteins where this domain is known:
    PF08_0036    PF08_0136b    PF13_0201    PF13_0324    PFC0640w    PFD0250c    PFF0800w   


    G3DSA:3.40.50.450 - G3DSA:3.40.50.450 (Gene3D link)

    Proteins where this domain is known:
    PF11_0294    PFD0670c    PFI0755c   


    G3DSA:3.40.50.460 - G3DSA:3.40.50.460 (Gene3D link)

    Proteins where this domain is known:
    PFI0755c   
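
    The layout used throughout this listing (a "G3DSA:" header line, then a "Proteins where this domain is known:" line followed by whitespace-separated gene identifiers) is regular enough to read mechanically. The following is a minimal sketch, assuming that layout holds for every entry; the function names parse_domains and proteins_in_multiple_domains are hypothetical and are not part of any existing tool. It builds a domain-to-protein map and reports proteins listed under more than one domain, such as PFI0755c in the two entries above.

```python
import re
from collections import defaultdict

# Assumed layout: "G3DSA:<accession> - <name> (Gene3D link)" opens an entry.
HEADER = re.compile(r"^\s*(G3DSA:\S+)\s+-\s+.*\(Gene3D link\)\s*$")
PROTEINS_MARKER = "Proteins where this domain is known:"


def parse_domains(text):
    """Return {domain accession: [protein IDs]} for a listing in the assumed layout."""
    domains = {}
    current = None
    collecting = False
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            domains[current] = []
            collecting = False
        elif line.strip() == PROTEINS_MARKER:
            collecting = current is not None
        elif collecting:
            if line.strip():
                # Identifier lists may wrap over several lines; keep collecting.
                domains[current].extend(line.split())
            else:
                # A blank line ends the identifier list.
                collecting = False
    return domains


def proteins_in_multiple_domains(domains):
    """Return {protein ID: [domain accessions]} for proteins listed more than once."""
    by_protein = defaultdict(list)
    for accession, proteins in domains.items():
        for protein in proteins:
            by_protein[protein].append(accession)
    return {p: accs for p, accs in by_protein.items() if len(accs) > 1}


# Two entries copied from this listing.
sample = """
    G3DSA:3.40.50.450 - G3DSA:3.40.50.450 (Gene3D link)

    Proteins where this domain is known:
    PF11_0294    PFD0670c    PFI0755c


    G3DSA:3.40.50.460 - G3DSA:3.40.50.460 (Gene3D link)

    Proteins where this domain is known:
    PFI0755c
"""

domains = parse_domains(sample)
shared = proteins_in_multiple_domains(domains)
print(shared)
```

    Interpro entry and description lines are skipped because identifiers are collected only between the marker line and the next blank line.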


    G3DSA:3.40.50.620 - Rossmann-like_a/b/a_fold (Gene3D link)

    Interpro entry IPR014729 : (Interpro link)

    Interpro description:

    This entry represents domains related by a common ancestor that have a Rossmann-like, 3-layer, alpha/beta/alpha sandwich fold, as found in the protein families listed below:

    Proteins where this domain is known:
    MAL13P1.281    MAL13P1.86    MAL8P1.125    PF08_0011    PF10_0053    PF10_0123    PF10_0147    PF10_0149    PF10_0340    PF11_0181    PF13_0159    PF13_0179    PF13_0205    PF13_0253    PF13_0257    PF14_0589    PFC0395w    PFC0470w    PFE0675c    PFF0610c    PFF1095w    PFI0680c    PFI1310w    PFL0900c    PFL1080c    PFL1210w    PFL2485c   


    G3DSA:3.40.50.670 - Topo_IIA_B/N_ab (Gene3D link)

    Interpro entry IPR013759 : DNA topoisomerase, type IIA, subunit B or N-terminal, alpha-beta (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    This entry represents the alpha-beta domain of subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316    PFL1915w   


    G3DSA:3.40.50.720 - NAD(P)-bd (Gene3D link)

    Interpro entry IPR016040 : NAD(P)-binding (Interpro link)

    Interpro description:

    This entry represents NAD- and NADP-binding domains with a core Rossmann-type fold, which consists of 3 layers (alpha/beta/alpha), where the six beta strands are parallel in the order 321456. Many different enzymes contain an NAD/NADP-binding domain, including:

    Proteins where this domain is known:
    MAL13P1.284    MAL8P1.75    PF08_0077    PF08_0132    PF10_0137    PF11_0097    PF11_0157    PF11_0407    PF11_0457    PF13_0141    PF13_0144    PF13_0182    PF13_0264    PF13_0344    PF14_0164    PF14_0286    PF14_0508    PF14_0511    PF14_0520    PFD0465c    PFD1035w    PFE0585c    PFF0730c    PFF0895w    PFF1265w    PFF1490w    PFI1125c    PFL0780w    PFL1245w    PFL1790w   


    G3DSA:3.40.50.790 - Ribosomal_L1_3-a/b-sand (Gene3D link)

    Interpro entry IPR016095 : Ribosomal protein L1, 3-layer alpha/beta-sandwich (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain, which interrupts the first domain. This entry represents the 3-layer domain.

    In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:

    Proteins where this domain is known:
    PF07_0046   


    G3DSA:3.40.50.80 - G3DSA:3.40.50.80 (Gene3D link)

    Proteins where this domain is known:
    PF13_0353    PF14_0478    PFF1115w    PFI1140w   


    G3DSA:3.40.50.800 - Anticodon_bd (Gene3D link)

    Interpro entry IPR004154 : Anticodon-binding (Interpro link)

    Interpro description:
    tRNA synthetases, or tRNA ligases, are involved in protein synthesis. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases; it is probably the anticodon-binding domain.

    Proteins where this domain is known:
    PF11_0270    PF14_0198    PF14_0428    PFL0670c   


    G3DSA:3.40.50.850 - Isochorismatase_hydro (Gene3D link)

    Interpro entry IPR000868 : Isochorismatase hydrolase (Interpro link)

    Interpro description:
    This is a family of hydrolase enzymes. Isochorismatase, also known as 2,3-dihydro-2,3-dihydroxybenzoate synthase, catalyses the conversion of isochorismate, in the presence of water, to 2,3-dihydro-2,3-dihydroxybenzoate and pyruvate.

    Proteins where this domain is known:
    PFC0910w   


    G3DSA:3.40.50.880 - G3DSA:3.40.50.880 (Gene3D link)

    Proteins where this domain is known:
    PF10_0123    PF13_0044    PF14_0100    PFF1335c    PFI1100w   


    G3DSA:3.40.50.920 - Transketo_C_like (Gene3D link)

    Interpro entry IPR015941 : Transketolase C-terminal-like (Interpro link)

    Interpro description:

    Transketolase C-terminal-like domains can be found in a number of different enzymes, including the C-terminal domain of the pyruvate dehydrogenase E1 component, the C-terminal domain of branched-chain alpha-keto acid dehydrogenases, and domain II of pyruvate-ferredoxin oxidoreductase (PFOR). Structural studies reveal that this domain comprises three layers (alpha/beta/alpha). The mixed beta sheet consists of five strands in the order 13245, where strand 1 is antiparallel to the others.

    Proteins where this domain is known:
    MAL13P1.186    PF14_0441    PFE0225w    PFF0530w   


    G3DSA:3.40.50.970 - G3DSA:3.40.50.970 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.186    PF08_0045    PF11_0256    PF13_0070    PF14_0441    PFE0225w    PFF0530w    PFF0945c   


    G3DSA:3.40.50.980 - G3DSA:3.40.50.980 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.485    PF07_0129    PF14_0761    PFB0685c    PFB0695c    PFC0050c    PFD0085c    PFE1250w    PFF0945c    PFF1350c    PFL0035c    PFL1880w    PFL2570w   


    G3DSA:3.40.525.10 - CRAL_bd_TRIO_C (Gene3D link)

    Interpro entry IPR001251 : (Interpro link)

    Interpro description:
    This entry defines the C-terminal of various retinaldehyde/retinal-binding proteins that may be functional components of the visual cycle. Cellular retinaldehyde-binding protein (CRALBP) carries 11-cis-retinol or 11-cis-retinaldehyde as endogenous ligands and may function as a substrate carrier protein that modulates interaction of these retinoids with visual cycle enzymes. The multidomain protein Trio binds the LAR transmembrane tyrosine phosphatase, contains a protein kinase domain, and has separate rac-specific and rho-specific guanine nucleotide exchange factor domains. Trio is a multifunctional protein that integrates and amplifies signals involved in coordinating actin remodeling, which is necessary for cell migration and growth.

    Other members of the family are transfer proteins, including a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.

    Proteins where this domain is known:
    PF11_0287    PFF1280w    PFF1450w    PFI1015w   


    G3DSA:3.40.532.10 - Peptidase_C12 (Gene3D link)

    Interpro entry IPR001578 : Peptidase C12, ubiquitin carboxyl-terminal hydrolase 1 (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example for family C12 is the human ubiquitin C-terminal hydrolase UCH-L1.

    Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.

    The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa peptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.

    Proteins where this domain is known:
    PF11_0177    PF14_0576   


    G3DSA:3.40.630.10 - G3DSA:3.40.630.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0439    PFA0170c    PFI1570c   


    G3DSA:3.40.630.30 - Acyl_CoA_acyltransferase (Gene3D link)

    Interpro entry IPR016181 : (Interpro link)

    Interpro description:

    This entry represents a structural domain found in several acyl-CoA acyltransferase enzymes. This domain has a 3-layer alpha/beta/alpha structure that contains mixed beta-sheets, and can be found in the following proteins:

    Several proteins carry a duplication of this domain, which consists of two NAT-like domains swapped with the C-terminal strands, including:

    Proteins where this domain is known:
    MAL8P1.200    PF08_0034    PF10_0036    PF11_0192    PF13_0131    PF14_0127    PF14_0350    PFA0465c    PFD0795w   


    G3DSA:3.40.640.10 - PyrdxlP-dep_Trfase_major_sub1 (Gene3D link)

    Interpro entry IPR015421 : Pyridoxal phosphate-dependent transferase, major region, subdomain 1 (Interpro link)

    Interpro description:

    Pyridoxal phosphate (PLP) is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.

    This entry represents subdomain 1 of the major region of PLP-dependent transferases. This domain has a 3-layer alpha/beta/alpha sandwich topology. The major region can be found in the following PLP-dependent transferase families:

    Proteins where this domain is known:
    MAL7P1.150    PF07_0068    PF14_0155    PF14_0534    PFB0200c    PFD0285c    PFF0435w    PFL1720w    PFL2210w   


    G3DSA:3.40.710.10 - G3DSA:3.40.710.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0143   


    G3DSA:3.40.718.10 - IDH_IMDH (Gene3D link)

    Interpro entry IPR001804 : Isocitrate/isopropylmalate dehydrogenase (Interpro link)

    Interpro description:

    Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyses the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent on either NAD+ or NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of an NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.

    3-isopropylmalate dehydrogenase (IMDH) catalyses the third step in the biosynthesis of leucine in bacteria and fungi, the oxidative decarboxylation of 3-isopropylmalate into 2-oxo-4-methylvalerate. Tartrate dehydrogenase catalyses the reduction of tartrate to oxaloglycolate.

    These enzymes are evolutionarily related. The best conserved region of these enzymes is a glycine-rich stretch of residues located in the C-terminal section.

    Proteins where this domain is known:
    PF13_0242   


    G3DSA:3.40.800.10 - Ureohydrolase (Gene3D link)

    Interpro entry IPR006035 : Ureohydrolase (Interpro link)

    Interpro description:

    The ureohydrolase superfamily includes arginase, agmatinase, formiminoglutamase and proclavaminate amidinohydrolase. These enzymes share a 3-layer alpha-beta-alpha structure, and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.

    Arginase, which catalyses the conversion of arginine to urea and ornithine, is one of the five members of the urea cycle enzymes that convert ammonia to urea as the principal product of nitrogen excretion. There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia.

    Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals.

    Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide.

    Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, this enzyme improves the effectiveness of beta-lactam antibiotics.

    Proteins where this domain is known:
    PFI0320w   


    G3DSA:3.40.800.20 - His_deacetylse (Gene3D link)

    Interpro entry IPR000286 : (Interpro link)

    Interpro description:
    Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal of the acetyl group. Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are all members of this ancient protein superfamily.

    Proteins where this domain is known:
    PF10_0078    PF14_0690    PFI1260c   


    G3DSA:3.40.850.10 - kinesin_motor (Gene3D link)

    Interpro entry IPR001752 : Kinesin, motor region (Interpro link)

    Interpro description:

    Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.

    The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, to bind and move on microtubules); a central alpha-helical coiled coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.

    A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:

    The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.

    The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule binding.
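
    The "ATP-binding motif of type A" (also called the Walker A motif or P-loop) is commonly written as the PROSITE-style consensus [AG]-x(4)-G-K-[ST]. The following is a minimal sketch of how such a motif can be located in a protein sequence, assuming that consensus. The function name find_type_a_motifs is hypothetical, and the test sequence is made up for illustration; it is not a P. falciparum protein.

```python
import re

# Assumed consensus for the type A (Walker A / P-loop) motif: [AG]-x(4)-G-K-[ST].
# A lookahead is used so that overlapping candidate sites are all reported.
TYPE_A_MOTIF = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_type_a_motifs(sequence):
    """Return (1-based start position, matched residues) for each motif occurrence."""
    return [(m.start() + 1, m.group(1)) for m in TYPE_A_MOTIF.finditer(sequence.upper())]


# Made-up illustrative sequence containing one P-loop-like stretch.
hits = find_type_a_motifs("MAAAGQTGSGKTFTM")
print(hits)
```

    A hit near residues 80 to 90 of a candidate sequence would be consistent with the position described above, but a pattern match alone does not establish that a protein contains a kinesin motor domain, since this short motif occurs in many unrelated nucleotide-binding proteins.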

    Proteins where this domain is known:
    MAL8P1.132    PF07_0104    PF11_0478    PFA0535c    PFC0770c    PFC0860w    PFL0545w    PFL2165w    PFL2190c   


    G3DSA:3.40.910.10 - Deoxyhypus_synth (Gene3D link)

    Interpro entry IPR002773 : Deoxyhypusine synthase (Interpro link)

    Interpro description:
    Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid, hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the post-translational formation of hypusine is catalysed by the enzyme deoxyhypusine synthase (DS). The enzyme catalyses the following reaction:
     Spermidine + [eIF-5A]-lysine = 1,3-diaminopropane + [eIF-5A]-deoxyhypusine 
    The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation. The structure is known for this enzyme in complex with its NAD+ cofactor.

    Proteins where this domain is known:
    PF14_0125   


    G3DSA:3.40.950.10 - G3DSA:3.40.950.10 (Gene3D link)

    Proteins where this domain is known:
    PFF0685c   


    G3DSA:3.50.30.20 - G3DSA:3.50.30.20 (Gene3D link)

    Proteins where this domain is known:
    PF13_0044   


    G3DSA:3.50.50.60 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL8P1.154    PF07_0085    PF08_0066    PF08_0068    PF10_0334    PF10_0373    PF14_0192    PF14_0334    PFC0275w    PFI0735c    PFI1170c    PFL0575w    PFL1550w    PFL2060c   


    G3DSA:3.50.7.10 - G3DSA:3.50.7.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0153    PF14_0123    PFL1545c   


    G3DSA:3.50.80.10 - DTyrtRNA_deacyls (Gene3D link)

    Interpro entry IPR003732 : D-tyrosyl-tRNA(Tyr) deacylase (Interpro link)

    Interpro description:

    This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader.

    Proteins where this domain is known:
    PF11_0095   


    G3DSA:3.55.10.10 - DUF101 (Gene3D link)

    Interpro entry IPR002804 : (Interpro link)

    Interpro description:

    Proteins in this entry are found in archaea, bacteria and eukaryotes. Their function is unknown, but alignment shows several conserved polar residues which are potential catalytic residues. The structure of one of these proteins has been determined and shows homology to heat shock protein 33, which is a chaperone protein that inhibits the aggregation of partially denatured proteins.

    Proteins where this domain is known:
    PF14_0269   


    G3DSA:3.60.10.10 - G3DSA:3.60.10.10 (Gene3D link)

    Proteins where this domain is known:
    PF07_0024    PF11_0122    PF14_0285    PFC0250c    PFL1870c   


    G3DSA:3.60.120.10 - TRPE_1_chor_bd (Gene3D link)

    Interpro entry IPR005801 : Anthranilate synthase component I and chorismate binding protein (Interpro link)

    Interpro description:
    This entry represents the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase. Anthranilate synthase catalyses the reaction:
     chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate 
    The enzyme is a tetramer comprising 2 I and 2 II components: this entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.

    Proteins where this domain is known:
    PFI1100w   


    G3DSA:3.60.15.10 - G3DSA:3.60.15.10 (Gene3D link)

    Proteins where this domain is known:
    PF07_0100    PF14_0364    PF14_0620    PF14_0711    PFC0825c    PFD0311w    PFL0285w    PFL1810w   


    G3DSA:3.60.20.10 - G3DSA:3.60.20.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.270    MAL8P1.128    MAL8P1.142    PF07_0112    PF10_0111    PF10_0245    PF13_0156    PF13_0282    PF14_0334    PF14_0676    PF14_0716    PFA0400c    PFC0395w    PFC0745c    PFE0915c    PFF0420c    PFI1545c    PFL1465c   


    G3DSA:3.60.21.10 - G3DSA:3.60.21.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.274    PF08_0129    PF10_0177    PF14_0036    PF14_0064    PF14_0142    PF14_0224    PF14_0614    PF14_0630    PF14_0660    PFA0390w    PFC0595c    PFI0880c    PFI1245c    PFI1360c    PFL0300c   


    G3DSA:3.60.40.10 - PP2C-related (Gene3D link)

    Interpro entry IPR001932 : Protein phosphatase 2C-related (Interpro link)

    Interpro description:

    This domain is found in protein phosphatase 2C, as well as other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase.

    Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa which shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. Its exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and which contains a C-terminal PP2C domain.

    PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.

    Proteins where this domain is known:
    MAL13P1.44    MAL8P1.108    MAL8P1.109    PF11_0362    PF11_0396    PF14_0523    PFD0505c    PFE1010w    PFF0770c    PFL0445w    PFL2365w   


    G3DSA:3.60.90.10 - SAM_decarbox (Gene3D link)

    Interpro entry IPR016067 : S-adenosylmethionine decarboxylase, core (Interpro link)

    Interpro description:

    S-adenosylmethionine decarboxylase (AdoMetDC) catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5'-3-methylpropylamine which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine.

    The catalytic mechanism of AdoMetDC involves a covalently-bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.

    Proteins where this domain is known:
    PF10_0322   


    G3DSA:3.65.10.20 - RNA3'_term_phos_cycl (Gene3D link)

    Interpro entry IPR000228 : (Interpro link)

    Interpro description:
    RNA cyclases are a family of RNA-modifying enzymes that are conserved in eukaryotes, bacteria and archaea. RNA 3'-terminal phosphate cyclase catalyses the conversion of 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of RNA.
     ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate 
    These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.

    RNA cyclase is a protein of 36 to 42 kDa. The best conserved region is a glycine-rich stretch of residues located in the central part of the sequence, which is reminiscent of various ATP, GTP or AMP glycine-rich loops.

    The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin. Although the active site of this enzyme could not be unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources.

    Proteins where this domain is known:
    PF14_0677   


    G3DSA:3.70.10.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF13_0328    PFL1285c   


    G3DSA:3.75.10.10 - G3DSA:3.75.10.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0178   


    G3DSA:3.80.10.10 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.238    MAL8P1.46    PF10_0320    PF10_0420    PF11_0243    PF11_0476    PF14_0021    PF14_0257    PF14_0305    PF14_0403    PF14_0496    PF14_0651    PF14_0785    PFE0455w    PFF0595c    PFI0330c    PFI1470c    PFL1360c    PFL2380c   


    G3DSA:3.90.100.10 - Decarbxylse_C (Gene3D link)

    Interpro entry IPR008286 : Orn/Lys/Arg decarboxylase, C-terminal (Interpro link)

    Interpro description:
    Pyridoxal-dependent decarboxylases are bacterial proteins acting on ornithine, lysine, arginine and related substrates. One of the regions of sequence similarity contains a conserved lysine residue, which is the site of attachment of the pyridoxal-phosphate group.

    Proteins where this domain is known:
    PFD0285c   


    G3DSA:3.90.1010.10 - G3DSA:3.90.1010.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0518    PFB0270w   


    G3DSA:3.90.1030.10 - Ribosomal_L17 (Gene3D link)

    Interpro entry IPR000456 : Ribosomal protein L17 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L17 is one of the proteins from the large ribosomal subunit. Bacterial L17 is a protein of 120 to 130 amino-acid residues while yeast YmL8 is twice as large (238 residues). The N-terminal half of YmL8 is colinear with the sequence of L17 from Escherichia coli.

    Proteins where this domain is known:
    PF14_0289    PFE1125w   


    G3DSA:3.90.110.10 - lact_mal_DH (Gene3D link)

    Interpro entry IPR015955 : Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal (Interpro link)

    Interpro description:

    This entry represents a structural motif found at the C-terminal of lactate dehydrogenases and malate dehydrogenases, as well as at the C-terminal of family 4 glycoside hydrolases. These domains have an unusual fold consisting of segregated alpha-helical and beta-sheet regions, although they contain predominantly anti-parallel beta-sheets.

    L-lactate dehydrogenases are metabolic enzymes that catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Glycoside hydrolase family 4 comprises enzymes with several known activities: 6-phospho-beta-glucosidase; 6-phospho-alpha-glucosidase; alpha-galactosidase.

    Proteins where this domain is known:
    PF13_0141    PF13_0144    PFF0895w   


    G3DSA:3.90.1110.10 - G3DSA:3.90.1110.10 (Gene3D link)

    Proteins where this domain is known:
    PFB0715w    PFL0330c   


    G3DSA:3.90.1120.10 - RNA_pol_Rpb1_1 (Gene3D link)

    Interpro entry IPR007080 : RNA polymerase Rpb1, domain 1 (Interpro link)

    Interpro description:

    RNA polymerases catalyse the DNA-dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand.

    Proteins where this domain is known:
    PF13_0150    PFC0805w    PFE0465c   


    G3DSA:3.90.1150.10 - PyrdxlP-dep_Trfase_major_sub2 (Gene3D link)

    Interpro entry IPR015422 : Pyridoxal phosphate-dependent transferase, major region, subdomain 2 (Interpro link)

    Interpro description:

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.

    This entry represents subdomain 2 of the major region of PLP-dependent transferases. This domain has a complex alpha/beta structure. The major region is found in several PLP-dependent transferase families.

    Proteins where this domain is known:
    PFF0435w   


    G3DSA:3.90.1170.10 - G3DSA:3.90.1170.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0141   


    G3DSA:3.90.1180.10 - Ribosomal_L13 (Gene3D link)

    Interpro entry IPR005822 : Ribosomal protein L13 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.

    Proteins where this domain is known:
    PF10_0043    PFB0645c   


    G3DSA:3.90.120.10 - G3DSA:3.90.120.10 (Gene3D link)

    Proteins where this domain is known:
    MAL7P1.151   


    G3DSA:3.90.1200.10 - G3DSA:3.90.1200.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0257    PF14_0020   


    G3DSA:3.90.1300.10 - Amidase_sig_enz (Gene3D link)

    Interpro entry IPR000120 : Amidase signature enzyme (Interpro link)

    Interpro description:

    Amidase signature (AS) enzymes are a large group of hydrolytic enzymes that contain a conserved stretch of approximately 130 amino acids known as the AS sequence. They are widespread, being found in both prokaryotes and eukaryotes. AS enzymes catalyse the hydrolysis of amide bonds (CO-NH2), although the family has diverged widely with regard to substrate specificity and function. Nonetheless, these enzymes maintain a core alpha/beta/alpha structure, where the topologies of the N- and C-terminal halves are similar. AS enzymes characteristically have a highly conserved C-terminal region rich in serine and glycine residues, but devoid of aspartic acid and histidine residues; they therefore differ from classical serine hydrolases. These enzymes possess a unique, highly conserved Ser-Ser-Lys catalytic triad used for amide hydrolysis, although the catalytic mechanism for acyl-enzyme intermediate formation can differ between enzymes.

    Proteins where this domain is known:
    PFD0780w   


    G3DSA:3.90.1410.10 - G3DSA:3.90.1410.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.25   


    G3DSA:3.90.1490.10 - G3DSA:3.90.1490.10 (Gene3D link)

    Proteins where this domain is known:
    PFL1080c   


    G3DSA:3.90.15.10 - TopoI_cat_a-hlx-sub_euk (Gene3D link)

    Interpro entry IPR014711 : DNA topoisomerase I, catalytic core, alpha-helical subdomain, eukaryotic-type (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry represents the alpha-helical subdomain that comprises part of the catalytic core of eukaryotic and viral topoisomerase I (type IB) enzymes, which occurs near the C-terminal region of the protein.

    Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.

    Vaccinia virus, a cytoplasmically-replicating poxvirus, encodes a type I DNA topoisomerase that is biochemically similar to eukaryotic-like DNA topoisomerases I, and which has been widely studied as a model topoisomerase. It is the smallest topoisomerase known and is unusual in that it is resistant to the potent chemotherapeutic agent camptothecin. The crystal structure of an amino-terminal fragment of vaccinia virus DNA topoisomerase I shows that the fragment forms a five-stranded, antiparallel beta-sheet with two short alpha-helices and connecting loops. Residues that are conserved between all eukaryotic-like type I topoisomerases are not clustered in particular regions of the structure.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PFE0520c   


    G3DSA:3.90.1550.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF13_0083    PFI1160w   


    G3DSA:3.90.170.10 - G3DSA:3.90.170.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0287   


    G3DSA:3.90.182.10 - G3DSA:3.90.182.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0491    PFA0445w   


    G3DSA:3.90.190.10 - G3DSA:3.90.190.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0139    PF13_0027    PF14_0524    PF14_0525    PFC0380w   


    G3DSA:3.90.190.20 - Mur_ligase_C (Gene3D link)

    Interpro entry IPR004101 : Mur ligase, C-terminal (Interpro link)

    Interpro description:

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is composed of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages.

    Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDP-MurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.
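
    The order of additions described above can be written down as a small Python sketch. It only records which ligase adds which building block to UDP-N-acetylmuramic acid, as stated in the text; the names MUR_LIGASE_STEPS and build_precursor are hypothetical and chosen for illustration, and the strings are labels, not chemical structures.

```python
# Order of additions as given in the text; identifiers here are illustrative only.
MUR_LIGASE_STEPS = [
    ("MurC", "L-alanine"),
    ("MurD", "D-glutamate"),
    ("MurE", "meso-diaminopimelate or L-lysine"),
    ("MurF", "D-alanyl-D-alanine"),
]


def build_precursor(start="UDP-N-acetylmuramic acid"):
    """Return the starting substrate followed by one labelled intermediate per ligase."""
    intermediates = [start]
    for enzyme, addition in MUR_LIGASE_STEPS:
        # Each step appends the building block and notes the enzyme responsible.
        intermediates.append(f"{intermediates[-1]} + {addition} [{enzyme}]")
    return intermediates


if __name__ == "__main__":
    for step in build_precursor():
        print(step)
```

    Keeping the steps in an ordered list mirrors the "successive additions" wording: reordering the list would change the product, just as the ligases act in a fixed sequence.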

    This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes the C-terminal domain of folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).

    The C-terminal domain is almost always found together with the N-terminal domain of the cytoplasmic peptidoglycan synthetases.

    Proteins where this domain is known:
    PF13_0140   


    G3DSA:3.90.199.10 - Topo_IIA_A/C_ab (Gene3D link)

    Interpro entry IPR013758 : DNA topoisomerase, type IIA, subunit A or C-terminal, alpha-beta (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
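
    As a compact restatement of the type II subdivision above, the following Python sketch stores the two subtypes in a lookup table. The class name TopoSubtype, the table TYPE_II and the helper subtype_of are hypothetical names introduced for illustration; the membership, ATP dependence and double-strand breakage come from the surrounding description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TopoSubtype:
    members: Tuple[str, ...]   # enzyme names as used in the text
    atp_dependent: bool        # type II enzymes are described as ATP-dependent
    strand_break: str          # "double" for type II enzymes


TYPE_II = {
    "IIA": TopoSubtype(("topoisomerase II", "gyrase", "topoisomerase IV"), True, "double"),
    "IIB": TopoSubtype(("topoisomerase VI",), True, "double"),
}


def subtype_of(enzyme: str) -> Optional[str]:
    """Return 'IIA' or 'IIB' for a known type II enzyme name, else None."""
    wanted = enzyme.strip().lower()
    for label, subtype in TYPE_II.items():
        if wanted in (m.lower() for m in subtype.members):
            return label
    return None


if __name__ == "__main__":
    for name in ("gyrase", "topoisomerase IV", "topoisomerase VI", "topoisomerase I"):
        print(name, "->", subtype_of(name))
```

    Type I enzymes are deliberately left out of the table, so a type I name simply returns None.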

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    This entry represents the alpha-beta domain of subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent C-terminal region in eukaryotic topoisomerase II composed of a single polypeptide.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316    PFL1120c   


    G3DSA:3.90.20.20 - GrpE_coiled_coil (Gene3D link)

    Interpro entry IPR013805 : GrpE nucleotide exchange factor, coiled-coil (Interpro link)

    Interpro description:

    In prokaryotes, the nucleotide exchange factor GrpE and the chaperone DnaJ are required for nucleotide binding of the molecular chaperone DnaK. The DnaK reaction cycle involves rapid peptide binding and release, which is dependent upon nucleotide binding. DnaJ accelerates the hydrolysis of ATP by DnaK, which enables the ADP-bound DnaK to tightly bind peptide. GrpE catalyses the release of ADP from DnaK, which is required for peptide release. In eukaryotes, GrpE is essential for mitochondrial Hsp70 function; however, the cytosolic Hsp70 homologues are GrpE-independent.

    GrpE binds as a homodimer to the ATPase domain of DnaK, and may interact with the peptide-binding domain of DnaK. GrpE accomplishes nucleotide exchange by opening the nucleotide-binding cleft of DnaK. GrpE is comprised of two domains, the N-terminal coiled coil domain, which may facilitate peptide release, and the C-terminal head domain, which forms part of the contact surface with the ATPase domain of DnaK. This entry represents the N-terminal coiled-coil domain.

    Proteins where this domain is known:
    PF11_0258   


    G3DSA:3.90.226.10 - G3DSA:3.90.226.10 (Gene3D link)

    Proteins where this domain is known:
    PF10_0167    PF14_0232    PF14_0348    PF14_0664    PFC0310c    PFL1940w   


    G3DSA:3.90.228.20 - PEP_carboxykinase_C (Gene3D link)

    Interpro entry IPR013035 : Phosphoenolpyruvate carboxykinase, C-terminal (Interpro link)

    Interpro description:

    Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.

    PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.

    PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding, which differs from the motifs of other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.

    This entry represents the C-terminal domain found in both GTP-utilising and ATP-utilising phosphoenolpyruvate carboxykinase enzymes.

    Proteins where this domain is known:
    PF13_0234   


    G3DSA:3.90.230.10 - Peptidase_M24_cat_core (Gene3D link)

    Interpro entry IPR000994 : Peptidase M24, structural domain (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
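
    The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be illustrated with regular expressions. This is a minimal sketch, not a validated motif scanner: the residue classes are assumptions (here "uncharged" excludes D, E, K and R, and the hydrophobic set is one common choice), position 'a' is left unrestricted because the text only says it is "most often" valine or threonine, and the test sequences are synthetic.

```python
import re

ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: "uncharged" excludes D, E, K and R; histidine is treated as uncharged.
CHARGED = set("DEKR")
# Assumption: one common choice of hydrophobic residues, not taken from the text.
HYDROPHOBIC = "ACFILMVWY"

# Proline is excluded at every position, since the text says it is never found here.
NO_PRO = "".join(r for r in ALL_RESIDUES if r != "P")
UNCHARGED = "".join(r for r in NO_PRO if r not in CHARGED)

# Loose form: H-E-X-X-H with any two residues in between.
LOOSE = re.compile(r"HE..H")
# Strict form: a-b-X-H-E-b-b-H-b-c using the assumed classes above.
STRICT = re.compile(
    rf"[{NO_PRO}][{UNCHARGED}][{NO_PRO}]HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]"
)


def find_motifs(sequence, pattern):
    """Return (start, matched_text) for each non-overlapping match in the sequence."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    synthetic_ok = "GGVIAHELTHAVGG"       # fits both definitions
    synthetic_charged = "GGVIAHEKTHAVGG"  # lysine at a 'b' position
    for seq in (synthetic_ok, synthetic_charged):
        print(seq, find_motifs(seq, LOOSE), find_motifs(seq, STRICT))
```

    The second synthetic sequence still contains HEXXH but fails the strict pattern, which is the sense in which the longer definition is "more stringent".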

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.

    The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.

    The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.

    Proteins where this domain is known:
    MAL8P1.140    PF10_0150    PF14_0261    PF14_0327    PF14_0517    PFE0870w    PFE1360c   


    G3DSA:3.90.244.10 - G3DSA:3.90.244.10 (Gene3D link)

    Proteins where this domain is known:
    PF14_0352   


    G3DSA:3.90.280.10 - PEBP (Gene3D link)

    Interpro entry IPR008914 : (Interpro link)

    Interpro description:

    The PEBP (PhosphatidylEthanolamine-Binding Protein) family is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, Drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development, serine protease inhibition, the control of the morphological switch between shoot growth and flower structures, and the regulation of several signalling pathways such as the MAP kinase pathway and the NF-kappaB pathway. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN).

    Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central beta-sheet flanked by a smaller beta-sheet on one side, and an alpha helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.

    Proteins where this domain is known:
    PFC0176c    PFL0955c   


    G3DSA:3.90.45.10 - Fmet_deformylase (Gene3D link)

    Interpro entry IPR000181 : Formylmethionine deformylase (Interpro link)

    Interpro description:

    Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:

     N-formyl-L-methionine + H2O = formate + methionyl peptide 
    Catalytic efficiency strongly depends on the identity of the bound metal.

    The structure of these enzymes is known. PDF, a member of the zinc metalloprotease family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.

    Proteins where this domain is known:
    PFI0380c   


    G3DSA:3.90.470.10 - Ribosomal_L22 (Gene3D link)

    Interpro entry IPR001063 : Ribosomal protein L22/L17 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22; archaebacterial L22; mammalian L17; plant L17 and yeast YL17.

    Proteins where this domain is known:
    PF13_0268    PF14_0642   


    G3DSA:3.90.470.20 - G3DSA:3.90.470.20 (Gene3D link)

    Proteins where this domain is known:
    PFD0980w   


    G3DSA:3.90.530.10 - G3DSA:3.90.530.10 (Gene3D link)

    Proteins where this domain is known:
    MAL7P1.32   


    G3DSA:3.90.550.10 - G3DSA:3.90.550.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.144    MAL13P1.218    PF11_0427    PF14_0774    PFA0340w    PFE0875c   


    G3DSA:3.90.640.10 - G3DSA:3.90.640.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.540    MAL7P1.228    PF07_0033    PF08_0054    PF11_0114    PF11_0351    PF14_0124    PFA0190c    PFI0875w    PFL2215w   


    G3DSA:3.90.660.20 - no description (Gene3D link)

    Proteins where this domain is known:
    PF10_0275   


    G3DSA:3.90.70.10 - G3DSA:3.90.70.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0161    PF11_0162    PF11_0165    PF11_0174    PF14_0553    PFB0325c    PFB0330c    PFB0335c    PFB0340c    PFB0345c    PFB0350c    PFB0355c    PFB0360c    PFD0230c    PFI0135c    PFL2290w   


    G3DSA:3.90.79.10 - NUDIX_hydrolase (Gene3D link)

    Interpro entry IPR000086 : NUDIX hydrolase, core (Interpro link)

    Interpro description:
    MutT is a small bacterial protein (~12-15 kDa) involved in the GO system responsible for removing an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and the nucleotide pool. 8-oxo-dGTP is inserted opposite dA and dC residues of template DNA with near equal efficiency, leading to A.T to G.C transversions. MutT specifically degrades 8-oxo-dGTP to the monophosphate, with the concomitant release of pyrophosphate. A short conserved N-terminal region of mutT (designated the MutT domain) is also found in a variety of other prokaryotic, viral and eukaryotic proteins.

    The generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family. The family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).

    Proteins where this domain is known:
    MAL13P1.248    PF13_0048    PFE1035c   


    G3DSA:3.90.80.10 - Pyrophosphatase (Gene3D link)

    Interpro entry IPR008162 : Inorganic pyrophosphatase (Interpro link)

    Interpro description:

    Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi), which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.

    The sequences of PPases share some regions of similarity, among which is a region that contains three conserved aspartates that are involved in the binding of cations.

    Proteins where this domain is known:
    PFC0710w   


    G3DSA:3.90.800.10 - G3DSA:3.90.800.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0170   


    G3DSA:3.90.830.10 - G3DSA:3.90.830.10 (Gene3D link)

    Proteins where this domain is known:
    PFB0750w    PFF0665c   


    G3DSA:3.90.870.10 - DHBP_synth_RibB-like_a/b_dom (Gene3D link)

    Interpro entry IPR017945 : (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of segregated alpha and beta regions in 3-layers. Homologous domains with this structure are found in:

    DHBP synthase RibB catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II that catalyses the first committed step in the biosynthesis of riboflavin. No sequences with significant homology to DHBP synthase are found in the metazoa.

    The YrdC family of hypothetical proteins is widely distributed in eukaryotes and prokaryotes, and its members occur: (i) as independent proteins, (ii) with C-terminal extensions, and (iii) as domains in larger proteins, some of which are implicated in regulation. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however, its exact function remains unknown. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases.

    Proteins where this domain is known:
    PFL0175c   


    G3DSA:3.90.920.10 - no description (Gene3D link)

    Proteins where this domain is known:
    PF14_0366   


    G3DSA:3.90.930.12 - Ribosomal_L6 (Gene3D link)

    Interpro entry IPR000702 : Ribosomal protein L6 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.

    Proteins where this domain is known:
    PF13_0129   


    G3DSA:3.90.940.10 - RNAP_RPB6_omega (Gene3D link)

    Interpro entry IPR012293 : RNA polymerase subunit, RPB6/omega (Interpro link)

    Interpro description:

    Prokaryotes contain a single RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) with specific transcriptional roles. In eukaryotes, the RPB6 subunit is common to all three polymerases. RPB6 is involved in the initiation of transcription. Bacterial DNA-dependent RNAP contains a small subunit termed omega, where the complete RNAP composition is beta'-beta-alpha(I)-alpha(II)-omega. The bacterial omega subunit is homologous in sequence and structure to the eukaryotic RPB6 subunit; they also have similar functional roles, being able to promote RNA polymerase assembly, possibly through a latching mechanism.

    Proteins where this domain is known:
    PFC0155c   


    G3DSA:3.90.940.20 - RNApol_RPB5 (Gene3D link)

    Interpro entry IPR000783 : RNA polymerase, subunit H/Rpb5 C-terminal (Interpro link)

    Interpro description:

    Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).

    This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2).

    Proteins where this domain is known:
    PF13_0341   


    G3DSA:3.90.950.10 - G3DSA:3.90.950.10 (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.69    MAL7P1.110    PFI0310w   


    G3DSA:3.90.960.10 - YbaK/aa-tRNA-synth-assoc-reg (Gene3D link)

    Interpro entry IPR007214 : (Interpro link)

    Interpro description:
    This domain of unknown function is found in numerous prokaryote organisms. The structure of YbaK shows a novel fold. This domain also occurs in a number of prolyl-tRNA synthetases (proRS) from prokaryotes. Thus, the domain is thought to be involved in oligonucleotide binding, with possible roles in recognition/discrimination or editing of prolyl-tRNA.

    Proteins where this domain is known:
    PFL0670c   


    G3DSA:4.10.1000.10 - G3DSA:4.10.1000.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0357    PF14_0236    PF14_0416    PFE1245w    PFL0510c   


    G3DSA:4.10.1030.10 - G3DSA:4.10.1030.10 (Gene3D link)

    Proteins where this domain is known:
    PF08_0094   


    G3DSA:4.10.1050.10 - no description (Gene3D link)

    Proteins where this domain is known:
    MAL13P1.115    MAL8P1.380   


    G3DSA:4.10.1060.10 - G3DSA:4.10.1060.10 (Gene3D link)

    Proteins where this domain is known:
    PF13_0278    PFD0405c   


    G3DSA:4.10.1110.10 - Znf_AN1 (Gene3D link)

    Interpro entry IPR000058 : Zinc finger, AN1-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like protein in Xenopus laevis. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate 2 zinc atoms.

    Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16). For example, the human protein Znf216 has an A20 zinc-finger at the N-terminus and an AN1 zinc-finger at the C-terminus, acting to negatively regulate the NFkappaB activation pathway and to interact with components of the immune response like RIP, IKKgamma and TRAF6. The interaction of Znf216 with IKKgamma and RIP is mediated by the A20 zinc-finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc-finger domain; therefore, both zinc-finger domains are involved in regulating the immune response. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway. Proteins containing an AN1-type zinc finger include:

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0056    PFE0200c   


    G3DSA:4.10.520.10 - Hist_DNA_bd_bac (Gene3D link)

    Interpro entry IPR000119 : Histone-like bacterial DNA-binding protein (Interpro link)

    Interpro description:

    Bacteria synthesize a set of small, usually basic proteins of about 90 residues that bind DNA and are known as histone-like proteins. Examples include the HU protein, which in Escherichia coli is a dimer of closely related alpha and beta chains and in other bacteria can be a dimer of identical chains. HU-type proteins have been found in a variety of eubacteria, cyanobacteria and archaebacteria, and are also encoded in the chloroplast genome of some algae. The integration host factor (IHF), a dimer of closely related chains that seems to function in genetic recombination as well as in translational and transcriptional control, is found in enterobacteria. Viral proteins in this family include the African swine fever virus protein A104R (or LMW5-AR).

    The exact function of these proteins is not yet clear but they are capable of wrapping DNA and stabilizing it from denaturation under extreme environmental conditions. The structure is known for one of these proteins. The protein exists as a dimer and two "beta-arms" function as the non-specific binding site for bacterial DNA.

    Proteins where this domain is known:
    PFI0230c   


    G3DSA:4.10.640.10 - Ribosomal_S18 (Gene3D link)

    Interpro entry IPR001648 : Ribosomal protein S18 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.

    Proteins where this domain is known:
    PFL0570c   


    G3DSA:4.10.830.10 - G3DSA:4.10.830.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0386   


    G3DSA:4.10.910.10 - G3DSA:4.10.910.10 (Gene3D link)

    Proteins where this domain is known:
    PF11_0272   


    G3DSA:4.10.950.10 - Ribosomal_L2 (Gene3D link)

    Interpro entry IPR014726 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents domain 3 of the ribosomal protein L2 from the large 50S subunit. The 50S subunit proteins function primarily to stabilize inter-domain interactions that are necessary to maintain the subunit's structural integrity, displaying a wide variety of protein-RNA interactions. This domain has an irregular structure.

    Proteins where this domain is known:
    PFE0845c   


    PD000001 - Prot_kinase (Prodom link)

    Proteins where this domain is known:
    MAL13P1.114    MAL13P1.185    MAL13P1.196    MAL13P1.267    MAL13P1.278    MAL13P1.279    MAL13P1.84    MAL7P1.100    MAL7P1.127    MAL7P1.132    MAL7P1.144    MAL7P1.175    MAL7P1.18    MAL7P1.26    MAL7P1.73    MAL8P1.203    MAL8P1.42    PF07_0072    PF08_0044    PF10_0141    PF10_0160    PF10_0380    PF11_0060    PF11_0079    PF11_0096    PF11_0127    PF11_0147    PF11_0156    PF11_0220    PF11_0227    PF11_0239    PF11_0242    PF11_0377    PF11_0464    PF11_0488    PF11_0510    PF13_0085    PF13_0166    PF13_0211    PF13_0258    PF14_0227    PF14_0264    PF14_0294    PF14_0346    PF14_0392    PF14_0408    PF14_0423    PF14_0431    PF14_0476    PF14_0516    PF14_0552    PF14_0734    PFA0130c    PFA0380w    PFB0150c    PFB0520w    PFB0605w    PFB0665w    PFB0815w    PFC0060c    PFC0105w    PFC0385c    PFC0420w    PFC0485w    PFC0525c    PFC0755c    PFC0945w    PFD0740w    PFD0865c    PFD0975w    PFD1165w    PFD1175w    PFE0045c    PFE0170c    PFE1290w    PFF0260w    PFF0520w    PFF0750w    PFF1145c    PFF1370w    PFI0095c    PFI0100c    PFI0105c    PFI0110c    PFI0115c    PFI0120c    PFI0125c    PFI1275w    PFI1280c    PFI1290w    PFI1415w    PFI1685w    PFL0040c    PFL0080c    PFL1370w    PFL1885c    PFL2250c    PFL2280w   


    PD000006 - ABC_transporter (Prodom link)

    Proteins where this domain is known:
    MAL13P1.344    MAL13P1.96    PF11_0225    PF11_0466    PF13_0218    PF13_0271    PF14_0133    PF14_0244    PF14_0321    PF14_0455    PFC0125w    PFD0685c    PFE1150w    PFL0495c   


    PD000012 - EF-hand (Prodom link)

    Proteins where this domain is known:
    MAL7P1.10    MAL7P1.69    PF07_0072    PF10_0271    PF10_0301    PF11_0066    PF11_0098    PF13_0211    PF14_0323    PF14_0420    PF14_0443    PF14_0492    PFA0345w    PFB0815w    PFC0420w    PFF0265c    PFF0520w    PFF1320c    PFL2225w   


    PD000018 - WD40 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.264    MAL7P1.81    MAL8P1.43    PF08_0019    PF08_0130    PF10_0261    PF10_0326    PF11_0171    PF11_0471    PF14_0263    PF14_0456    PFC0100c    PFC0365w    PFD0455w    PFE0090w    PFE0540w    PFE0930w    PFF0330w    PFI0290c    PFL0470w    PFL0970w    PFL1480w    PFL1975c    PFL2460w   


    PD000089 - Hsp70 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.540    MAL7P1.228    PF08_0054    PF11_0351    PFI0875w   


    PD000131 - Copper_CuA (Prodom link)

    Proteins where this domain is known:
    PF14_0288   


    PD000139 - FAD_pyr_redox (Prodom link)

    Proteins where this domain is known:
    PF07_0085    PF08_0066    PF14_0192    PF14_0334    PFI1170c    PFL1550w   


    PD000158 - Peptidase_C1 (Prodom link)

    Proteins where this domain is known:
    PF11_0161    PF11_0162    PF11_0165    PF11_0174    PF14_0553    PFB0325c    PFD0230c    PFI0135c    PFL2290w   


    PD000252 - T_phtase_apaH (Prodom link)

    Proteins where this domain is known:
    MAL13P1.274    PF08_0129    PF10_0177    PF14_0142    PF14_0630    PFC0595c    PFI1245c    PFI1360c   


    PD000288 - Aldo/ket_red (Prodom link)

    Proteins where this domain is known:
    MAL13P1.324    PF14_0088   


    PD000295 - Q8WPZ6_PLAFA_Q8WPZ6; (Prodom link)

    Proteins where this domain is known:
    PF11_0338   


    PD000355 - Myosin_head (Prodom link)

    Proteins where this domain is known:
    MAL13P1.148    PF11_0416    PF13_0233    PFE0175c   


    PD000375 - Cyt_CIAB (Prodom link)

    Proteins where this domain is known:
    PF14_0038   


    PD000395 - Kringle (Prodom link)

    Proteins where this domain is known:
    PFI0550w   


    PD000461 - UBQ_conjugat (Prodom link)

    Proteins where this domain is known:
    MAL13P1.227    PF08_0085    PF10_0330    PF13_0301    PF14_0128    PFC0255c    PFC0855w    PFE1350c    PFF0305c    PFI0740c    PFI1030c    PFL0190w    PFL2100w    PFL2175w   


    PD000475 - Q6J6P7_PLAFA_Q6J6P7; (Prodom link)

    Proteins where this domain is known:
    PF08_0071    PFF1130c   


    PD000497 - Histone_H2B (Prodom link)

    Proteins where this domain is known:
    PF07_0054    PF11_0062   


    PD000511 - Q9Y007_PLAFA_Q9Y007; (Prodom link)

    Proteins where this domain is known:
    PF13_0229   


    PD000522 - O22430_PINTA_O22430; (Prodom link)

    Proteins where this domain is known:
    PFC0920w    PFF0860c   


    PD000566 - Chaprnin_Cpn10 (Prodom link)

    Proteins where this domain is known:
    PF13_0180    PFL0740c   


    PD000600 - 14-3-3 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.309    MAL8P1.69    PF14_0220   


    PD000612 - Cyt_B5 (Prodom link)

    Proteins where this domain is known:
    PF14_0266    PFI0885w    PFL1555w   


    PD000657 - Adenylate_kin (Prodom link)

    Proteins where this domain is known:
    PF10_0086    PFA0555c    PFD0755c   


    PD000707 - Ppfruckinase (Prodom link)

    Proteins where this domain is known:
    PF11_0294    PFI0755c   


    PD000742 - DNA_topoisoIV (Prodom link)

    Proteins where this domain is known:
    PF14_0316    PFL1120c   


    PD000779 - Anth_synth_chor (Prodom link)

    Proteins where this domain is known:
    PFI1100w   


    PD000817 - Ribosomal_S7 (Prodom link)

    Proteins where this domain is known:
    PF07_0088   


    PD000819 - SRP54 (Prodom link)

    Proteins where this domain is known:
    PF13_0350    PF14_0477    PFL0075w   


    PD000865 - Euk_COanhd (Prodom link)

    Proteins where this domain is known:
    PF11_0410    PF11_0411   


    PD000887 - Acyl_carrier (Prodom link)

    Proteins where this domain is known:
    PFB0385w    PFL0415w   


    PD000902 - Enolase (Prodom link)

    Proteins where this domain is known:
    PF08_0078    PF10_0155   


    PD000944 - ATPsynt_DE (Prodom link)

    Proteins where this domain is known:
    PF11_0485   


    PD000945 - Bac_DNAbind (Prodom link)

    Proteins where this domain is known:
    PFI0230c   


    PD001005 - TPIS_PLAFA_Q07412; (Prodom link)

    Proteins where this domain is known:
    PF14_0378    PFC0831w   


    PD001009 - Pyruvate_kinase (Prodom link)

    Proteins where this domain is known:
    PF10_0363    PFF1300w   


    PD001010 - Ribosomal_S11 (Prodom link)

    Proteins where this domain is known:
    PFE0810c   


    PD001012 - Ribosomal_S19 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.92   


    PD001018 - NDK (Prodom link)

    Proteins where this domain is known:
    PF13_0349    PFF0275c   


    PD001032 - Ribosomal_L22 (Prodom link)

    Proteins where this domain is known:
    PF13_0268    PF14_0642   


    PD001041 - MCM (Prodom link)

    Proteins where this domain is known:
    PF07_0023    PF13_0095    PF13_0291    PF14_0177    PFE1345c    PFL0580w   


    PD001057 - Gln_synt_C (Prodom link)

    Proteins where this domain is known:
    PFI1110w   


    PD001081 - FA_desat_sub (Prodom link)

    Proteins where this domain is known:
    PFE0555w   


    PD001093 - Ribosomal_L14 (Prodom link)

    Proteins where this domain is known:
    PF13_0171    PFE0960w   


    PD001098 - Ribosomal_S8 (Prodom link)

    Proteins where this domain is known:
    MAL7P1.93    PFC0735w   


    PD001109 - Hexokinase (Prodom link)

    Proteins where this domain is known:
    PFF1155w   


    PD001115 - 2Oxoacid_dh (Prodom link)

    Proteins where this domain is known:
    PF10_0407    PF13_0121    PFC0170c   


    PD001128 - Q9U6R2_PLAFA_Q9U6R2; (Prodom link)

    Proteins where this domain is known:
    PF14_0425   


    PD001129 - Q8IKU0_PLAF7_Q8IKU0; (Prodom link)

    Proteins where this domain is known:
    PF14_0511   


    PD001141 - Ribosomal_L23 (Prodom link)

    Proteins where this domain is known:
    PF13_0132    PFL1895w   


    PD001180 - Q8I1R6_PLAF7_Q8I1R6; (Prodom link)

    Proteins where this domain is known:
    PFD0830w   


    PD001188 - Asucc_synthtase (Prodom link)

    Proteins where this domain is known:
    PF13_0287   


    PD001202 - PI_PLC_Y (Prodom link)

    Proteins where this domain is known:
    PF10_0132   


    PD001229 - Synaptobrevin (Prodom link)

    Proteins where this domain is known:
    MAL13P1.135    MAL13P1.16   


    PD001243 - SpoU_mtfrase (Prodom link)

    Proteins where this domain is known:
    PF10_0300    PF14_0273    PFB0855c    PFE1275c   


    PD001263 - MutS_C (Prodom link)

    Proteins where this domain is known:
    MAL7P1.206    PF14_0254    PFE0270c   


    PD001272 - Ribosomal_S10 (Prodom link)

    Proteins where this domain is known:
    PF10_0038    PF14_0581   


    PD001278 - NAD_Gly3P_C (Prodom link)

    Proteins where this domain is known:
    PF11_0157    PFL0780w   


    PD001295 - Ribosomal_S17 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.327    PFC0775w   


    PD001314 - Ribosomal_L1 (Prodom link)

    Proteins where this domain is known:
    PF07_0046    PF14_0391    PFL0500w   


    PD001326 - Ribosomal_L12 (Prodom link)

    Proteins where this domain is known:
    PFB0545c    PFE1225w   


    PD001363 - Ribosomal_S13 (Prodom link)

    Proteins where this domain is known:
    PF11_0272   


    PD001367 - Ribosomal_L11 (Prodom link)

    Proteins where this domain is known:
    PF11_0113   


    PD001374 - Ribosomal_L3 (Prodom link)

    Proteins where this domain is known:
    PFI0890c    PFL2180w   


    PD001394 - Ribosomal_L18p (Prodom link)

    Proteins where this domain is known:
    PF14_0230    PFF0650w   


    PD001511 - IGPS (Prodom link)

    Proteins where this domain is known:
    MAL13P1.319   


    PD001589 - U_glycsylse_notp (Prodom link)

    Proteins where this domain is known:
    PF14_0148   


    PD001627 - Ribosomal_S9 (Prodom link)

    Proteins where this domain is known:
    PF08_0076    PF11_0382    PF14_0132   


    PD001677 - Ribosomal_L24 (Prodom link)

    Proteins where this domain is known:
    PFF0245w    PFL1150c   


    PD001791 - Ribosomal_L13 (Prodom link)

    Proteins where this domain is known:
    PFB0645c   


    PD001819 - PseudoU_synth (Prodom link)

    Proteins where this domain is known:
    PFE0570w    PFE1080w    PFI0685w   


    PD001827 - Q8IIV2_PLAF7_Q8IIV2; (Prodom link)

    Proteins where this domain is known:
    PF11_0061   


    PD001861 - Nramp (Prodom link)

    Proteins where this domain is known:
    PFE1185w   


    PD001961 - Aminotrans_IV (Prodom link)

    Proteins where this domain is known:
    PF14_0557   


    PD001963 - Botulinum (Prodom link)

    Proteins where this domain is known:
    MAL13P1.25    MAL7P1.86    MAL8P1.37    MAL8P1.47    PF07_0021    PF07_0042    PF08_0032    PF10_0312    PF11_0528    PF14_0252    PF14_0406    PF14_0419    PF14_0648    PFD0872w    PFI0850w   


    PD002014 - Inorg_pphsph (Prodom link)

    Proteins where this domain is known:
    PFC0710w   


    PD002096 - PD002096 (Prodom link)

    Proteins where this domain is known:
    PF14_0097   


    PD002183 - HesB_yadR_yfhF (Prodom link)

    Proteins where this domain is known:
    PFB0320c    PFC1005c    PFE1135w   


    PD002221 - Desaturase (Prodom link)

    Proteins where this domain is known:
    PFE0555w   


    PD002239 - Ribosomal_S18 (Prodom link)

    Proteins where this domain is known:
    PFL0570c   


    PD002276 - Toprim_primase (Prodom link)

    Proteins where this domain is known:
    MAL8P1.105   


    PD002304 - Q7RGP4_PLAYO_Q7RGP4; (Prodom link)

    Proteins where this domain is known:
    PF14_0381   


    PD002367 - Peptidase_M22 (Prodom link)

    Proteins where this domain is known:
    PF10_0299   


    PD002379 - SAM_decarbox (Prodom link)

    Proteins where this domain is known:
    PF10_0322   


    PD002389 - Ribosomal_L20 (Prodom link)

    Proteins where this domain is known:
    PF14_0709   


    PD002395 - PD002395 (Prodom link)

    Proteins where this domain is known:
    PF14_0027   


    PD002595 - Ribosomal_L33 (Prodom link)

    Proteins where this domain is known:
    MAL8P1.110    PFB0467w   


    PD002667 - Ribosomal_S4E (Prodom link)

    Proteins where this domain is known:
    PF11_0065   


    PD002673 - Q8WSN0_PLAFA_Q8WSN0; (Prodom link)

    Proteins where this domain is known:
    PF13_0328    PFL1285c   


    PD002745 - Q7RNI8_PLAYO_Q7RNI8; (Prodom link)

    Proteins where this domain is known:
    PFL0480w   


    PD002792 - Ferrochelatase (Prodom link)

    Proteins where this domain is known:
    MAL13P1.326   


    PD002830 - NifU_C (Prodom link)

    Proteins where this domain is known:
    PFI1050c    PFI1835c   


    PD002841 - Ribosomal_L44E (Prodom link)

    Proteins where this domain is known:
    PFC0200w   


    PD002880 - IF3 (Prodom link)

    Proteins where this domain is known:
    MAL8P1.27   


    PD002941 - Chorismate_synth (Prodom link)

    Proteins where this domain is known:
    PFF1105c   


    PD002979 - Ribosomal_L19 (Prodom link)

    Proteins where this domain is known:
    PFF0495w   


    PD003035 - Ribosomal_S3AE (Prodom link)

    Proteins where this domain is known:
    PFC1020c   


    PD003041 - Znf_DHHC (Prodom link)

    Proteins where this domain is known:
    MAL13P1.117    MAL13P1.126    MAL7P1.68    PF10_0273    PF11_0167    PF11_0217    PFB0140w    PFB0725c    PFC0160w    PFE1415w    PFF0485c    PFI1580c   


    PD003114 - Ribosomal_L27 (Prodom link)

    Proteins where this domain is known:
    PF10_0332    PFC0701w   


    PD003210 - Herpes_UL6 (Prodom link)

    Proteins where this domain is known:
    PFB0730w   


    PD003225 - Q6LFJ3_PLAF7_Q6LFJ3; (Prodom link)

    Proteins where this domain is known:
    PF11_0236    PF11_0453    PFF0360w   


    PD003329 - Depp_CoAkinase (Prodom link)

    Proteins where this domain is known:
    PF14_0415   


    PD003330 - Q6PSS5_PLACH_Q6PSS5; (Prodom link)

    Proteins where this domain is known:
    PFL1155w   


    PD003417 - Ribosomal_L35 (Prodom link)

    Proteins where this domain is known:
    PFI0375w   


    PD003460 - Ribosomal_S6E (Prodom link)

    Proteins where this domain is known:
    PF13_0228   


    PD003461 - UPP_synth (Prodom link)

    Proteins where this domain is known:
    MAL8P1.22   


    PD003604 - Ribosomal_L21p (Prodom link)

    Proteins where this domain is known:
    PF08_0014    PF14_0212   


    PD003662 - FAD_Synth (Prodom link)

    Proteins where this domain is known:
    MAL13P1.292   


    PD003697 - Q9VAR1_DROME_Q9VAR1; (Prodom link)

    Proteins where this domain is known:
    PFC0635c   


    PD003738 - GIDA (Prodom link)

    Proteins where this domain is known:
    PFL2115c   


    PD003791 - Ribosomal_S16 (Prodom link)

    Proteins where this domain is known:
    PFE1560c   


    PD003809 - Ribosomal_S6 (Prodom link)

    Proteins where this domain is known:
    PFI1585c   


    PD003823 - Ribosomal_L32E (Prodom link)

    Proteins where this domain is known:
    PFI0190w   


    PD003829 - CAS_kinase_II (Prodom link)

    Proteins where this domain is known:
    PF11_0048    PF13_0232   


    PD003844 - Fmet_deformylase (Prodom link)

    Proteins where this domain is known:
    PFI0380c   


    PD003854 - Ribosomal_S19E (Prodom link)

    Proteins where this domain is known:
    PFD1055w   


    PD003992 - XGLTT_domain (Prodom link)

    Proteins where this domain is known:
    PF13_0355   


    PD004078 - eIF5_eIF2B (Prodom link)

    Proteins where this domain is known:
    PF10_0103    PFL0335c   


    PD004103 - RRF (Prodom link)

    Proteins where this domain is known:
    PFB0390w    PFE0530w   


    PD004104 - Nop (Prodom link)

    Proteins where this domain is known:
    PF10_0085    PF11_0191    PFD0450c   


    PD004109 - IPP_isomerase (Prodom link)

    Proteins where this domain is known:
    MAL13P1.248   


    PD004122 - ATPsynt_Dsub (Prodom link)

    Proteins where this domain is known:
    PF13_0227   


    PD004277 - Ribosomal_L17 (Prodom link)

    Proteins where this domain is known:
    PF14_0289    PFE1125w   


    PD004282 - ACPS (Prodom link)

    Proteins where this domain is known:
    PFD0980w   


    PD004329 - TCTP (Prodom link)

    Proteins where this domain is known:
    PFE0545c   


    PD004330 - Q7KQK1_PLAF7_Q7KQK1; (Prodom link)

    Proteins where this domain is known:
    PFL0955c   


    PD004390 - FAD_binding_N (Prodom link)

    Proteins where this domain is known:
    PFE0675c   


    PD004399 - Diphthamide_syn (Prodom link)

    Proteins where this domain is known:
    PF14_0136   


    PD004443 - Ribosomal_L13E (Prodom link)

    Proteins where this domain is known:
    PF08_0075   


    PD004466 - Ribosomal_S27E (Prodom link)

    Proteins where this domain is known:
    PF13_0045   


    PD004495 - Ribosomal_L7A (Prodom link)

    Proteins where this domain is known:
    PF10_0187    PF11_0250    PF14_0231    PFC0295c    PFD0960c   


    PD004563 - Fizzy (Prodom link)

    Proteins where this domain is known:
    PF10_0261   


    PD004637 - Fibrillarin (Prodom link)

    Proteins where this domain is known:
    PF14_0068   


    PD004674 - IPPT (Prodom link)

    Proteins where this domain is known:
    PFL0380c   


    PD004738 - Q6FRR0_EEEEE_Q6FRR0; (Prodom link)

    Proteins where this domain is known:
    PF13_0234   


    PD004816 - Q8IHI7_BRUMA_Q8IHI7; (Prodom link)

    Proteins where this domain is known:
    PFL1420w   


    PD004823 - Ribosomal_L19e (Prodom link)

    Proteins where this domain is known:
    PFF0700c   


    PD004958 - Snz1p/Sor1 (Prodom link)

    Proteins where this domain is known:
    PFF1025c   


    PD005043 - DAGKc (Prodom link)

    Proteins where this domain is known:
    PF14_0681    PFI1485c   


    PD005132 - Q7RKV5_PLAYO_Q7RKV5; (Prodom link)

    Proteins where this domain is known:
    MAL7P1.320   


    PD005145 - Dynein_light1 (Prodom link)

    Proteins where this domain is known:
    MAL7P1.161    PFL0660w   


    PD005155 - RNA_pol_H_23kD (Prodom link)

    Proteins where this domain is known:
    PF13_0341   


    PD005242 - NusB_region (Prodom link)

    Proteins where this domain is known:
    PF11_0305    PFL1475w   


    PD005267 - Ribosomal_NusG (Prodom link)

    Proteins where this domain is known:
    PFF0535c   


    PD005388 - IPPtrans_like (Prodom link)

    Proteins where this domain is known:
    PFL0380c   


    PD005541 - Ribosomal_S28e (Prodom link)

    Proteins where this domain is known:
    PF14_0585   


    PD005579 - TIF_eIF-1A (Prodom link)

    Proteins where this domain is known:
    PF11_0447   


    PD005595 - DUF59 (Prodom link)

    Proteins where this domain is known:
    PF11_0296   


    PD005639 - Znf_ZPR1 (Prodom link)

    Proteins where this domain is known:
    PF13_0313   


    PD005653 - DTyrtRNA_deacyls (Prodom link)

    Proteins where this domain is known:
    PF11_0095   


    PD005658 - Ribosomal_S8E (Prodom link)

    Proteins where this domain is known:
    MAL7P1.24    PF14_0083   


    PD005774 - ERD2_PLAFA_P33948; (Prodom link)

    Proteins where this domain is known:
    MAL13P1.163    PF13_0280   


    PD005813 - RpiA (Prodom link)

    Proteins where this domain is known:
    PFE0730c   


    PD006030 - Ribosomal_L31e (Prodom link)

    Proteins where this domain is known:
    PFE0185c   


    PD006052 - Ribosomal_S24E (Prodom link)

    Proteins where this domain is known:
    PFE0975c   


    PD006086 - Lipoate_B (Prodom link)

    Proteins where this domain is known:
    MAL8P1.37   


    PD006217 - EF1_G (Prodom link)

    Proteins where this domain is known:
    PF13_0214   


    PD006276 - Ribosomal_S7E (Prodom link)

    Proteins where this domain is known:
    PF13_0014   


    PD006364 - Q7RG18_PLAYO_Q7RG18; (Prodom link)

    Proteins where this domain is known:
    MAL13P1.176    PF13_0198    PFD0850c   


    PD006539 - RNA_pol_N (Prodom link)

    Proteins where this domain is known:
    PF07_0027   


    PD006584 - Ribosomal_S21E (Prodom link)

    Proteins where this domain is known:
    PF11_0454   


    PD006587 - O48857_SOLNI_O48857; (Prodom link)

    Proteins where this domain is known:
    MAL8P1.205    PFF0920c   


    PD006591 - Ribosomal_L37ae (Prodom link)

    Proteins where this domain is known:
    PFB0455w   


    PD006609 - SRP19 (Prodom link)

    Proteins where this domain is known:
    PFL0785c   


    PD006662 - S10_plectin_N (Prodom link)

    Proteins where this domain is known:
    PF07_0080   


    PD006880 - eIF6 (Prodom link)

    Proteins where this domain is known:
    PF13_0178   


    PD006960 - F-actin_cap_A (Prodom link)

    Proteins where this domain is known:
    PFE1420w   


    PD007262 - Trans_pterinDh (Prodom link)

    Proteins where this domain is known:
    PF11_0095a   


    PD007270 - COX5B (Prodom link)

    Proteins where this domain is known:
    PFI1365w   


    PD007306 - Ribosomal_L22e (Prodom link)

    Proteins where this domain is known:
    PF08_0039   


    PD007661 - Znf_constans (Prodom link)

    Proteins where this domain is known:
    PF14_0383    PFE0895c   


    PD007711 - FAD_binding_C (Prodom link)

    Proteins where this domain is known:
    PFE0675c   


    PD007730 - Deoxyhypus_synth (Prodom link)

    Proteins where this domain is known:
    PF14_0125   


    PD007914 - Q9SRT6_ARATH_Q9SRT6; (Prodom link)

    Proteins where this domain is known:
    PFF0573c   


    PD008105 - Enh_rudimentary (Prodom link)

    Proteins where this domain is known:
    PF10_0370   


    PD008148 - TFAR19-related (Prodom link)

    Proteins where this domain is known:
    PFI0450c   


    PD008153 - UCR_14kDa (Prodom link)

    Proteins where this domain is known:
    PF10_0120   


    PD008617 - Factin_cap_beta (Prodom link)

    Proteins where this domain is known:
    PFE0880c   


    PD008800 - Peptidase_C57 (Prodom link)

    Proteins where this domain is known:
    PFL1175w   


    PD009163 - UPF0086 (Prodom link)

    Proteins where this domain is known:
    PFF1355w   


    PD009170 - SRP14 (Prodom link)

    Proteins where this domain is known:
    PFL0160w   


    PD009192 - Ribosomal_L36e (Prodom link)

    Proteins where this domain is known:
    PF11_0106   


    PD009396 - Ribosomal_L27e (Prodom link)

    Proteins where this domain is known:
    PF14_0579   


    PD009460 - G10 (Prodom link)

    Proteins where this domain is known:
    PFE1140c   


    PD009481 - Mago_nashi (Prodom link)

    Proteins where this domain is known:
    MAL7P1.139   


    PD009612 - Ribosomal_L6E (Prodom link)

    Proteins where this domain is known:
    PF13_0213   


    PD009649 - PurDNA_glycsylse (Prodom link)

    Proteins where this domain is known:
    PF14_0639   


    PD009671 - DUF51 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.172   


    PD009796 - UPF0023 (Prodom link)

    Proteins where this domain is known:
    PF14_0107   


    PD009834 - PD009834 (Prodom link)

    Proteins where this domain is known:
    PF08_0035    PFE0280c   


    PD009934 - Pox_mRNA-cap (Prodom link)

    Proteins where this domain is known:
    MAL13P1.311   


    PD010355 - SEC61_g_subunit (Prodom link)

    Proteins where this domain is known:
    PFB0450w   


    PD010361 - Ribosomal_L38e (Prodom link)

    Proteins where this domain is known:
    PF11_0312   


    PD010667 - UPF0099 (Prodom link)

    Proteins where this domain is known:
    PFD0355c   


    PD010724 - RNA_pol_Rpb8 (Prodom link)

    Proteins where this domain is known:
    PFL0665c   


    PD011090 - PD011090 (Prodom link)

    Proteins where this domain is known:
    PFI0215c   


    PD011819 - Armadillo (Prodom link)

    Proteins where this domain is known:
    PF13_0034   


    PD012151 - DNA_RNApol_7kD (Prodom link)

    Proteins where this domain is known:
    MAL13P1.213   


    PD012268 - Ribosomal_S25 (Prodom link)

    Proteins where this domain is known:
    PF14_0205   


    PD012670 - Ribosomal_L35AE (Prodom link)

    Proteins where this domain is known:
    PF11_0438   


    PD012963 - PD012963 (Prodom link)

    Proteins where this domain is known:
    PF14_0075    PF14_0076    PF14_0077    PF14_0078   


    PD012969 - DUF101 (Prodom link)

    Proteins where this domain is known:
    PF14_0269   


    PD013253 - PD013253 (Prodom link)

    Proteins where this domain is known:
    PF11_0218   


    PD013434 - Ribosomal_L5_mit (Prodom link)

    Proteins where this domain is known:
    PF07_0079   


    PD014111 - Pox_I1_rel (Prodom link)

    Proteins where this domain is known:
    PFL0405w   


    PD014904 - PD014904 (Prodom link)

    Proteins where this domain is known:
    PF10_0252   


    PD015172 - Cyt_c_ox6B (Prodom link)

    Proteins where this domain is known:
    PFI1375w   


    PD016033 - PD016033 (Prodom link)

    Proteins where this domain is known:
    MAL7P1.37   


    PD016494 - PD016494 (Prodom link)

    Proteins where this domain is known:
    PF08_0004    PF11_0224   


    PD017661 - Q7T9T3_GVAO_Q7T9T3; (Prodom link)

    Proteins where this domain is known:
    PF13_0022   


    PD018366 - PD018366 (Prodom link)

    Proteins where this domain is known:
    PF14_0502   


    PD019198 - Q7RE06_PLAYO_Q7RE06; (Prodom link)

    Proteins where this domain is known:
    PF14_0784   


    PD020235 - PD020235 (Prodom link)

    Proteins where this domain is known:
    PF13_0051   


    PD020287 - snRNP (Prodom link)

    Proteins where this domain is known:
    MAL13P1.253    MAL8P1.48    MAL8P1.9    PF08_0049    PF11_0255    PF11_0280    PF14_0146    PF14_0411    PFB0865w    PFL0460w   


    PD021457 - Gamma_adaptin_C (Prodom link)

    Proteins where this domain is known:
    PF14_0529   


    PD022844 - LFTR_PHOLL_Q7N6F5; (Prodom link)

    Proteins where this domain is known:
    PFB0585w   


    PD024360 - Surf1 (Prodom link)

    Proteins where this domain is known:
    PFE1550w   


    PD025234 - PD025234 (Prodom link)

    Proteins where this domain is known:
    PF11_0207   


    PD027623 - PD027623 (Prodom link)

    Proteins where this domain is known:
    PF10_0311   


    PD030375 - PD030375 (Prodom link)

    Proteins where this domain is known:
    PFI0715w   


    PD031131 - PD031131 (Prodom link)

    Proteins where this domain is known:
    PFA0480w   


    PD032841 - PD032841 (Prodom link)

    Proteins where this domain is known:
    PFE0065w   


    PD034433 - PD034433 (Prodom link)

    Proteins where this domain is known:
    PF13_0067   


    PD034736 - Metdp_prot_hydro (Prodom link)

    Proteins where this domain is known:
    PFF1295w   


    PD035217 - VG56_BPT4_P39262; (Prodom link)

    Proteins where this domain is known:
    PFA0650w   


    PD036877 - PD036877 (Prodom link)

    Proteins where this domain is known:
    PF11_0229   


    PD042406 - PD042406 (Prodom link)

    Proteins where this domain is known:
    MAL7P1.167    PF14_0084    PF14_0132    PFB0315w    PFL1805c   


    PD042831 - PD042831 (Prodom link)

    Proteins where this domain is known:
    PFF0565c   


    PD054980 - PD054980 (Prodom link)

    Proteins where this domain is known:
    PF11_0199   


    PD067159 - Urm1 (Prodom link)

    Proteins where this domain is known:
    PF11_0393   


    PD069703 - Q8ILC7_PLAF7_Q8ILC7; (Prodom link)

    Proteins where this domain is known:
    PF14_0317   


    PD075190 - PD075190 (Prodom link)

    Proteins where this domain is known:
    PFB0765w   


    PD088957 - DUF327 (Prodom link)

    Proteins where this domain is known:
    PF11_0218   


    PD091196 - PD091196 (Prodom link)

    Proteins where this domain is known:
    PFF0990c   


    PD098974 - PD098974 (Prodom link)

    Proteins where this domain is known:
    PFA0510w   


    PD102835 - PD102835 (Prodom link)

    Proteins where this domain is known:
    PFL0755c   


    PD111724 - PD111724 (Prodom link)

    Proteins where this domain is known:
    PF08_0064   


    PD115089 - PD115089 (Prodom link)

    Proteins where this domain is known:
    PFF1470c   


    PD115092 - PD115092 (Prodom link)

    Proteins where this domain is known:
    PF11_0482   


    PD125763 - PD125763 (Prodom link)

    Proteins where this domain is known:
    PF13_0218    PFI1710w   


    PD133308 - PD133308 (Prodom link)

    Proteins where this domain is known:
    PF08_0006   


    PD135598 - PD135598 (Prodom link)

    Proteins where this domain is known:
    PF14_0640   


    PD149633 - DNA_gyrase_B (Prodom link)

    Proteins where this domain is known:
    PF14_0316    PFL1915w   


    PD149806 - TMP_synthase (Prodom link)

    Proteins where this domain is known:
    PFF0680c   


    PD153432 - Csurface_antigen (Prodom link)

    Proteins where this domain is known:
    MAL13P1.158    MAL8P1.24    PF07_0004    PFE0070w    PFE1325w   


    PD157043 - Ribosomal_S15_b (Prodom link)

    Proteins where this domain is known:
    PF11_0072    PF13_0059   


    PD170000 - PD170000 (Prodom link)

    Proteins where this domain is known:
    PF11_0216   


    PD181906 - Q73MC9_TREDE_Q73MC9; (Prodom link)

    Proteins where this domain is known:
    PFD0320c   


    PD186100 - IF2 (Prodom link)

    Proteins where this domain is known:
    PF08_0018    PFE0830c    PFF0345w   


    PD208305 - PD208305 (Prodom link)

    Proteins where this domain is known:
    PFD0320c    PFI0245c   


    PD214932 - PD214932 (Prodom link)

    Proteins where this domain is known:
    MAL8P1.101   


    PD215945 - PD215945 (Prodom link)

    Proteins where this domain is known:
    PFE0300c   


    PD275159 - Q8LAA4_ARATH_Q8LAA4; (Prodom link)

    Proteins where this domain is known:
    MAL8P1.300   


    PD311402 - UPF0108 (Prodom link)

    Proteins where this domain is known:
    PF07_0103   


    PD317567 - PD317567 (Prodom link)

    Proteins where this domain is known:
    PF14_0637    PFD0320c   


    PD337143 - PD337143 (Prodom link)

    Proteins where this domain is known:
    PF13_0296   


    PD350662 - Peptidase_C12 (Prodom link)

    Proteins where this domain is known:
    PF11_0177    PF14_0576   


    PD351532 - Ac_coA_bind_prot (Prodom link)

    Proteins where this domain is known:
    PF08_0099    PF10_0015    PF10_0016    PF14_0749   


    PD363422 - Mov34-1 (Prodom link)

    Proteins where this domain is known:
    MAL13P1.343    PFI0630w    PFI0895c   


    PD802108 - PD802108 (Prodom link)

    Proteins where this domain is known:
    MAL7P1.92   
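
    The entries above share a regular layout: a header line, a "Proteins where this domain is known:" label, and a whitespace-separated list of protein identifiers. They can therefore be parsed mechanically. The following is a minimal Python sketch and is not part of the original listing; the names parse_entries, proteins_to_domains and SAMPLE are illustrative assumptions, and the sample text is copied from three entries of this listing. It also builds a reverse index from protein to domain names.

```python
import re
from collections import defaultdict

# Header line of one entry, e.g. "PD000819 - SRP54 (Prodom link)".
HEADER = re.compile(r"^\s*(?P<acc>\S+) - (?P<name>.+?) \((?P<db>\w+) link\)\s*$")
PROTEINS_LABEL = "Proteins where this domain is known:"

# Three entries copied from the listing (hypothetical test input).
SAMPLE = """
    PD001081 - FA_desat_sub (Prodom link)

    Proteins where this domain is known:
    PFE0555w   


    PD002221 - Desaturase (Prodom link)

    Proteins where this domain is known:
    PFE0555w   


    PD001005 - TPIS_PLAFA_Q07412; (Prodom link)

    Proteins where this domain is known:
    PF14_0378    PFC0831w   
"""


def parse_entries(text):
    """Return a list of (accession, name, [protein ids]) tuples."""
    entries = []
    lines = iter(text.splitlines())
    accession = name = None
    for line in lines:
        header = HEADER.match(line)
        if header:
            accession = header.group("acc")
            # Some names carry a trailing ';' in the listing; drop it.
            name = header.group("name").rstrip(";")
        elif line.strip() == PROTEINS_LABEL and accession is not None:
            # The identifiers sit on the line right after the label.
            proteins = next(lines, "").split()
            entries.append((accession, name, proteins))
            accession = name = None
    return entries


def proteins_to_domains(entries):
    """Map each protein identifier to the domain names that list it."""
    index = defaultdict(list)
    for _accession, name, proteins in entries:
        for protein in proteins:
            index[protein].append(name)
    return dict(index)


entries = parse_entries(SAMPLE)
print(proteins_to_domains(entries)["PFE0555w"])  # ['FA_desat_sub', 'Desaturase']
```

    Free-text description lines between the header and the label are skipped by this sketch, so the same loop should also tolerate entries that carry an Interpro description.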


    PIRSF000102 - Lac_mal_DH (Pirsf link)

    Interpro entry IPR001557 : L-lactate/malate dehydrogenase (Interpro link)

    Interpro description:

    This family contains both lactate and malate dehydrogenases. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.

    L-lactate dehydrogenase (LDH) catalyses the reversible NAD-dependent interconversion of pyruvate and L-lactate. In vertebrate muscles and in lactic acid bacteria it represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In the eye lenses of birds and crocodilians, LDH-B serves as a structural protein and is known as epsilon-crystallin.

    L-2-hydroxyisocaproate dehydrogenase (L-hicDH) catalyses the reversible and stereospecific interconversion between 2-ketocarboxylic acids and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs.

    Proteins where this domain is known:
    PF13_0141    PF13_0144    PFF0895w   
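
    Entries that are integrated into Interpro carry an extra line of the form "Interpro entry IPRxxxxxx : title (Interpro link)". A minimal Python sketch for extracting the accession and title from such a line is given below. It is not part of the original listing; the name parse_interpro_line is an illustrative assumption, and the pattern assumes accessions are "IPR" followed by six digits.

```python
import re

# Matches lines such as
# "Interpro entry IPR001557 : L-lactate/malate dehydrogenase (Interpro link)".
INTERPRO_LINE = re.compile(
    r"^\s*Interpro entry (?P<acc>IPR\d{6}) : (?P<title>.*?)\s*\(Interpro link\)\s*$"
)


def parse_interpro_line(line):
    """Return (accession, title) for an Interpro entry line, or None.

    The title is an empty string when the listing gives none.
    """
    match = INTERPRO_LINE.match(line)
    if match is None:
        return None
    return match.group("acc"), match.group("title")


line = "    Interpro entry IPR001557 : L-lactate/malate dehydrogenase (Interpro link)"
print(parse_interpro_line(line))  # ('IPR001557', 'L-lactate/malate dehydrogenase')
```

    Lines that do not follow this layout, such as a ProDom header, return None, so the function can be applied to every line of an entry without prior filtering.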


    PIRSF000108 - IDH_NADP (Pirsf link)

    Interpro entry IPR004790 : Isocitrate dehydrogenase NADP-dependent, eukaryotic (Interpro link)

    Interpro description:

    Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyzes the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is dependent on either NAD+ or NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of an NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.

    This group defines the eukaryotic NADP-dependent isocitrate dehydrogenases. It includes the cytosolic, mitochondrial and chloroplast enzymes, and it also matches a small number of bacterial proteins.

    Proteins where this domain is known:
    PF13_0242   


    PIRSF000114 - Glycerol-3-P_dh (Pirsf link)

    Interpro entry IPR006168 : NAD-dependent glycerol-3-phosphate dehydrogenase (Interpro link)

    Interpro description:

    NAD-dependent glycerol-3-phosphate dehydrogenase (GPD) catalyzes the reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a cytoplasmic protein, active as a homodimer, each monomer containing an N-terminal NAD binding site. In insects, it acts in conjunction with a mitochondrial alpha-glycerophosphate oxidase in the alpha-glycerophosphate cycle, which is essential for the production of energy used in insect flight.

    Proteins where this domain is known:
    PF11_0157    PFL0780w   


    PIRSF000130 - IMPDH (Pirsf link)

    Interpro entry IPR018529 : IMP dehydrogenase related (Interpro link)

    Interpro description:

    Synonyms: Inosine-5'-monophosphate dehydrogenase, Inosinic acid dehydrogenase

    IMP dehydrogenase (IMPDH) catalyzes the rate-limiting reaction of de novo GTP biosynthesis, the NAD-dependent oxidation of IMP into XMP.

                       Inosine 5'-phosphate + NAD+ + H2O = xanthosine 5'-phosphate + NADH + H+

    IMP dehydrogenase is associated with cell proliferation and is a possible target for cancer chemotherapy. Mammalian and bacterial IMPDHs are tetramers of identical chains. There are two IMP dehydrogenase isozymes in humans. IMP dehydrogenase adopts a TIM barrel structure and nearly always contains a long insertion that has two CBS domains within it.

    Proteins where this domain is known:
    PFI1020c   


    PIRSF000149 - GAP_DH (Pirsf link)

    Interpro entry IPR000173 : Glyceraldehyde 3-phosphate dehydrogenase (Interpro link)

    Interpro description:

    Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme.

    GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.

    GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington's disease, Alzheimer's disease, Parkinson's disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins.

    Proteins where this domain is known:
    PF14_0598   


    PIRSF000157 - Oxoglu_dh_E1 (Pirsf link)

    Interpro entry IPR011603 : 2-oxoglutarate dehydrogenase, E1 component (Interpro link)

    Interpro description:

    2-oxoglutarate dehydrogenase is a key enzyme in the TCA cycle, converting 2-oxoglutarate, coenzyme A and NAD(+) to succinyl-CoA, NADH and carbon dioxide. The activity of this enzyme is tightly regulated and it is a major determinant of the metabolic flux through the TCA cycle. This enzyme is composed of multiple copies of three different subunits: 2-oxoglutarate dehydrogenase (E1), dihydrolipoamide succinyltransferase (E2) and lipoamide dehydrogenase (E3), which is often shared with similar enzymes such as pyruvate dehydrogenase. The E2 component forms a large multimeric core which binds the peripheral E1 and E3 subunits. The substrate is transferred between the active sites of the different subunits by a lipoyl moiety, bound to a lysine residue from the E2 polypeptide.

    This entry represents the E1 subunit of 2-oxoglutarate dehydrogenase. It catalyses the decarboxylation of 2-oxoglutarate in a thiamine pyrophosphate-dependent manner, transferring the resultant succinyl group onto the lipoyl moiety bound to the E2 subunit. The E1 ortholog from Corynebacterium glutamicum (Brevibacterium flavum) is unusual in having an N-terminal extension that resembles the E2 component of the 2-oxoglutarate dehydrogenase enzyme.

    Proteins where this domain is known:
    PF08_0045   


    PIRSF000185 - Glu_DH (Pirsf link)

    Interpro entry IPR014362 : Glutamate dehydrogenase (Interpro link)

    Interpro description:

    This entry represents a glutamate dehydrogenase.

    Proteins where this domain is known:
    PF14_0164    PF14_0286   


    PIRSF000193 - Pyrrol-5-carb_rd (Pirsf link)

    Interpro entry IPR000304 : Delta 1-pyrroline-5-carboxylate reductase (Interpro link)

    Interpro description:

    Delta 1-pyrroline-5-carboxylate reductase (P5CR) is the enzyme that catalyzes the terminal step in the biosynthesis of proline from glutamate, the NAD(P)H-dependent reduction of 1-pyrroline-5-carboxylate to proline.

    Proteins where this domain is known:
    MAL13P1.284   


    PIRSF000303 - Glutathion_perox (Pirsf link)

    Interpro entry IPR000889 : Glutathione peroxidase (Interpro link)

    Interpro description:

    Glutathione peroxidase (GSHPx) is an enzyme that catalyses the reduction of hydroperoxides by glutathione. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function has been shown to be evolutionarily related to those of GSHPxs.

    In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPxs, a selenocysteine.

    Proteins where this domain is known:
    PFL0595c   


    PIRSF000349 - SODismutase (Pirsf link)

    Interpro entry IPR001189 : Manganese and iron superoxide dismutase (Interpro link)

    Interpro description:

    Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SODs, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.

    The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).

    Proteins where this domain is known:
    PF08_0071   


    PIRSF000361 - Frd-NADP+_RD (Pirsf link)

    Interpro entry IPR012146 : Ferredoxin--NADP reductase (Interpro link)

    Interpro description:

    Ferredoxin-NADP+ reductase (FNR) is one of several soluble partners that can receive an electron from ferredoxin once it has been reduced by photosystem I in chloroplasts and cyanobacteria. FNR catalyses the reduction of NADP+ to NADPH, using the electrons provided by the reduced ferredoxin, with the aid of a FAD cofactor.

    Proteins where this domain is known:
    PFF1115w   


    PIRSF000389 - DHFR-TS (Pirsf link)

    Interpro entry IPR012262 : Bifunctional dihydrofolate reductase/thymidylate synthase (Interpro link)

    Interpro description:

    This group represents a bifunctional dihydrofolate reductase/thymidylate synthase found in some plant species and protozoal parasites including malarial species and trypanosomes. In other species dihydrofolate reductase and thymidylate synthase are encoded on separate polypeptides.

    Thymidylate synthase catalyzes the reductive methylation of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to dihydrofolate:

     5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP 
    This provides the sole de novo pathway for production of dTMP and is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is important for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process can cause various biological and genetic abnormalities. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.

    Dihydrofolate reductase (DHFR) catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate:

    5,6,7,8-tetrahydrofolate + NADP+ = 7,8-dihydrofolate + NADPH + H+
    This is an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis), and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood.

    As this enzyme is essential in both nucleic acid and amino acid biosynthesis, it is an important target of antiparasitic drugs. Resistance to antimalarial drugs that target this enzyme is often due to mutations that prevent drug binding but maintain enzyme activity. The structure of the wild-type and drug resistant malarial enzymes provides insights into the development of resistance and suggests approaches for the design of new drugs against this target.

    Proteins where this domain is known:
    PFD0830w   


    PIRSF000412 - SHMT (Pirsf link)

    Interpro entry IPR001085 : Glycine hydroxymethyltransferase (Interpro link)

    Interpro description:
    Synonym(s): Serine hydroxymethyltransferase, Serine aldolase, Threonine aldolase

    Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) dependent enzyme and belongs to the aspartate aminotransferase superfamily (fold type I). The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme. The enzyme carries out interconversion of serine and glycine using PLP as the cofactor. SHMT catalyses the transfer of a hydroxymethyl group from N5,N10-methylene tetrahydrofolate to glycine, resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers and the mammalian enzyme forms a homotetramer. PLP dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalysed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV) and glycogen phosphorylase family (fold type V).

    In vertebrates, glycine hydroxymethyltransferase exists in a cytoplasmic and a mitochondrial form whereas only one form is found in prokaryotes.

    Proteins where this domain is known:
    PFL1720w   


    PIRSF000429 - Ac-CoA_Ac_transf (Pirsf link)

    Interpro entry IPR002155 : (Interpro link)

    Interpro description:

    Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.

    In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.

    There are two conserved cysteine residues important for thiolase activity. The first located in the N-terminal section of the enzymes is involved in the formation of an acyl-enzyme intermediate; the second located at the C-terminal extremity is the active site base involved in deprotonation in the condensation reaction.

    Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2 while the N-terminal portion is evolutionarily related to thiolases.

    Proteins where this domain is known:
    PF14_0484   


    PIRSF000431 - Glycerol-3-P_O-acyltransfrase (Pirsf link)

    Interpro entry IPR016222 : Glycerol-3-phosphate O-acyltransferase (Interpro link)

    Interpro description:

    This group represents a glycerol-3-phosphate O-acyltransferase.

    Proteins where this domain is known:
    PF13_0100   


    PIRSF000497 - MAT (Pirsf link)

    Interpro entry IPR002133 : S-adenosylmethionine synthetase (Interpro link)

    Interpro description:

    S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.

    Bacteria have a single isoform of AdoMet synthetase (gene metK), budding yeast (genes SAM1 and SAM2) and mammals each have two, and plants generally have a multigene family.

    The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.

    Proteins where this domain is known:
    PFI1090w   


    PIRSF000513 - Thz_kinase (Pirsf link)

    Interpro entry IPR011144 : Hydroxyethylthiazole kinase, monofunctional (Interpro link)

    Interpro description:

    This group represents a predicted hydroxyethylthiazole kinase. THZ kinase activity is involved in the salvage synthesis of TH-P from the thiazole:

     2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate
    Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate.

    Proteins where this domain is known:
    PFL1920c   


    PIRSF000548 - PK_regulatory (Pirsf link)

    Interpro entry IPR012198 : cAMP-dependent protein kinase regulatory subunit (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    In the absence of cAMP, protein kinase A (PKA) exists as an equimolar tetramer of regulatory (R) and catalytic (C) subunits. In addition to its role as an inhibitor of the C subunit, the R subunit anchors the holoenzyme to specific intracellular locations and prevents the C subunit from entering the nucleus. Typical R subunits have a conserved domain structure, consisting of the N-terminal dimerisation domain, inhibitory region, cAMP-binding domain A and cAMP-binding domain B. R subunits interact with C subunits primarily through the inhibitory site. The cAMP-binding domains show extensive sequence similarity and bind cAMP cooperatively.

    On the basis of phylogenetic trees generated from multiple sequence alignment of complete sequences, this family was divided into four sub-families, types I to IV. Types I and II, found in animals, differ in molecular weight, sequence, autophosphorylation capability, cellular location and tissue distribution. Types I and II are further sub-divided into alpha and beta subtypes, based mainly on sequence similarity. Type III are from fungi and type IV are from alveolates.

    Proteins where this domain is known:
    PFL1110c   


    PIRSF000747 - RPB5 (Pirsf link)

    Interpro entry IPR014381 : (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

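    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:

    RNA polymerase I synthesises the precursors of most ribosomal RNAs.
    RNA polymerase II synthesises mRNA precursors.
    RNA polymerase III synthesises the precursors of 5S ribosomal RNA, the tRNAs and a variety of other small RNAs.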

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses range from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    This entry represents a DNA-directed RNA polymerase, RPB5 subunit.

    Proteins where this domain is known:
    PF13_0341   


    PIRSF000779 - RNA_pol_Rpb8 (Pirsf link)

    Interpro entry IPR005570 : RNA polymerase, Rpb8 (Interpro link)

    Interpro description:
    Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits.

    Proteins where this domain is known:
    PFL0665c   


    PIRSF000848 - CDP_diag_ino_3_P (Pirsf link)

    Interpro entry IPR014387 : (Interpro link)

    Interpro description:

    This entry represents a CDP-diacylglycerol-inositol 3-phosphatidyltransferase.

    Proteins where this domain is known:
    MAL13P1.82   


    PIRSF000868 - 14-3-3 (Pirsf link)

    Interpro entry IPR000308 : 14-3-3 protein (Interpro link)

    Interpro description:

    The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.

    14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein, by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification, or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
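
    The RSxpSxP consensus quoted above can be searched for in a plain protein sequence as R-S-x-S-x-P, with the second serine being the residue that would need to be phosphorylated. The following is only a minimal sketch: the function name and the example sequence are hypothetical, and a sequence match says nothing about whether the site is actually phosphorylated or bound in vivo.

```python
import re

# Sequence-level form of the RSxpSxP consensus: R-S-x-S-x-P.
# "." stands for x (any amino acid); the lookahead allows overlapping hits.
CONSENSUS = re.compile(r"(?=(RS.S.P))")


def find_candidate_sites(sequence):
    """Return (motif_start, motif, serine_position) tuples, all 1-based.

    serine_position is the serine that the consensus marks as phosphorylated.
    """
    hits = []
    for match in CONSENSUS.finditer(sequence.upper()):
        start = match.start() + 1          # convert to 1-based numbering
        hits.append((start, match.group(1), start + 3))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence used purely for illustration.
    for start, motif, serine in find_candidate_sites("MKRSTSAPGG"):
        print(f"motif {motif} at {start}, candidate phosphoserine at {serine}")
```

    Because the source notes that exceptions to the phospho-motif rule exist, a scan like this can only nominate candidates for further checking.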

    Proteins where this domain is known:
    MAL13P1.309    MAL8P1.69   


    PIRSF001109 - Ad_hcy_hydrolase (Pirsf link)

    Interpro entry IPR000043 : S-adenosyl-L-homocysteine hydrolase (Interpro link)

    Interpro description:
    S-adenosyl-L-homocysteine hydrolase (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydrolysis of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein of about 430 to 470 amino acids. The family contains a glycine-rich region in the central part of AdoHcyase, a region thought to be involved in NAD-binding.

    Proteins where this domain is known:
    PFE1050w   


    PIRSF001237 - DHOdimr (Pirsf link)

    Interpro entry IPR004721 : Dihydroorotase homodimeric type (Interpro link)

    Interpro description:

    Dihydroorotase belongs to MEROPS peptidase family M38 (clan MJ), where it is classified as a non-peptidase homologue. DHOase catalyses the third step in the de novo biosynthesis of pyrimidine, the conversion of ureidosuccinic acid (N-carbamoyl-L-aspartate) into dihydroorotate. Dihydroorotase binds a zinc ion which is required for its catalytic activity.

    In bacteria, DHOase is a dimer of identical chains of about 400 amino-acid residues (gene pyrC). In the metazoa, DHOase is part of a large multi-functional protein known as 'rudimentary' in Drosophila melanogaster and CAD in mammals and which catalyzes the first three steps of pyrimidine biosynthesis. The DHOase domain is located in the central part of this polyprotein. In yeast, DHOase is encoded by a monofunctional protein (gene URA4). However, a defective DHOase domain is found in a multifunctional protein (gene URA2) that catalyzes the first two steps of pyrimidine biosynthesis.

    The comparison of DHOase sequences from various sources shows that there are two highly conserved regions. The first located in the N-terminal extremity contains two histidine residues suggested to be involved in binding the zinc ion. The second is found in the C-terminal part. Members of this family of proteins are predicted to adopt a TIM barrel fold.

    This family represents the homodimeric form of dihydroorotase. It is found in bacteria, plants and fungi; URA4 of yeast is a member of this group of sequences.

    Proteins where this domain is known:
    PF14_0697   


    PIRSF001265 - H+-PPase (Pirsf link)

    Interpro entry IPR004131 : Inorganic H+ pyrophosphatase (Interpro link)

    Interpro description:

    Two types of proteins that hydrolyse inorganic pyrophosphate (PPi), very different in both amino acid sequence and structure, have been characterised to date: soluble and membrane-bound proton-pumping pyrophosphatases (sPPases and H(+)-PPases, respectively). sPPases are ubiquitous proteins that hydrolyse PPi to release heat, whereas H+-PPases, so far unidentified in animal and fungal cells, couple the energy of PPi hydrolysis to proton movement across biological membranes. The latter type is represented by this group of proteins. H+-PPases are also called vacuolar-type inorganic pyrophosphatases (V-PPase) or pyrophosphate-energised vacuolar membrane proton pumps. In plants, vacuoles contain two enzymes for acidifying the interior of the vacuole, the V-ATPase and the V-PPase (V is for vacuolar).

    Two distinct biochemical subclasses of H+-PPases have been characterised to date: K+-stimulated and K+-insensitive.

    Proteins where this domain is known:
    PF14_0541   


    PIRSF001357 - DeoC (Pirsf link)

    Interpro entry IPR011343 : Deoxyribose-phosphate aldolase (Interpro link)

    Interpro description:

    Class I aldolases catalyse carbon-carbon bond formation using a 'Schiff base' mechanism. This entry represents deoxyribose-phosphate aldolase, a widely distributed enzyme, which catalyses the following reversible reaction:

     2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde
    While the physiological role of this enzyme remains unknown in eukaryotes, in prokaryotes it is thought to function in the catabolism of deoxyribonucleotides.

    In all studied structures, the deoxyribose-phosphate aldolase subunits adopt the classical eight-bladed TIM barrel fold. The oligomerisation state of the enzyme appears to depend on the living temperature of the organism: the Escherichia coli enzyme is a homodimer, while the enzymes from the thermophilic microorganisms Thermus thermophilus and Aeropyrum pernix are homotetramers. The degree of oligomerisation does not, however, appear to affect catalysis.

    Proteins where this domain is known:
    PF10_0210   


    PIRSF001400 - Enolase (Pirsf link)

    Interpro entry IPR000941 : Enolase (Interpro link)

    Interpro description:

    Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.
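
    Taken literally, "a dimer of any 2 isoforms" from three isoenzymes gives six unordered pairings. The sketch below simply enumerates them; it is a combinatorial illustration only (the function name is hypothetical), and the text above describes tissue contexts for five of the six, with beta/gamma not mentioned.

```python
from itertools import combinations_with_replacement

# The three vertebrate isoenzymes named in the description.
ISOFORMS = ("alpha", "beta", "gamma")


def possible_dimers(isoforms=ISOFORMS):
    """List every unordered pairing of isoforms, homodimers included."""
    return ["/".join(pair) for pair in combinations_with_replacement(isoforms, 2)]


if __name__ == "__main__":
    for dimer in possible_dimers():
        print(dimer)
```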

    Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.

    Proteins where this domain is known:
    PF10_0155   


    PIRSF001529 - Ser-tRNA-synth_IIa (Pirsf link)

    Interpro entry IPR002317 : Seryl-tRNA synthetase, class IIa (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.
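
    The class assignments listed above lend themselves to a small lookup table. This is a minimal sketch that encodes only what the description states (class membership and the preferred tRNA hydroxyl); the names CLASS_I, CLASS_II and synthetase_class are illustrative and the three-letter codes are the standard amino-acid abbreviations.

```python
# Amino-acid specificities per synthetase class, as listed in the description.
CLASS_I = frozenset(
    {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
)
CLASS_II = frozenset(
    {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}
)

# Hydroxyl of the tRNA to which the aminoacyl group is coupled (preferred site).
PREFERRED_HYDROXYL = {"I": "2'", "II": "3'"}


def synthetase_class(amino_acid):
    """Return "I" or "II" for a three-letter amino-acid code (case-insensitive)."""
    code = amino_acid.capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


if __name__ == "__main__":
    cls = synthetase_class("Ser")
    print(f"Seryl-tRNA synthetase: class {cls}, "
          f"{PREFERRED_HYDROXYL[cls]}-hydroxyl preferred")
```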

    Proteins where this domain is known:
    PF07_0073   


    PIRSF001558 - GSHase (Pirsf link)

    Interpro entry IPR005615 : Glutathione synthase, eukaryotic (Interpro link)

    Interpro description:

    This entry represents eukaryotic glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system.

    Proteins where this domain is known:
    PFE0605c   


    PIRSF002116 - Ribosomal_S4 (Pirsf link)

    Interpro entry IPR000876 : Ribosomal protein S4e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4. Two highly similar isoforms of mammalian S4 exist, one coded by a gene on chromosome Y, and the other on chromosome X. These proteins have 233 to 264 amino acids.

    Proteins where this domain is known:
    PF11_0065   


    PIRSF002131 - Ribosomal_S11 (Pirsf link)

    Interpro entry IPR001971 : Ribosomal protein S11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.

    Proteins where this domain is known:
    PFE0810c   


    PIRSF002133 - Ribosomal_S12/S23 (Pirsf link)

    Interpro entry IPR006032 : Ribosomal protein S12/S23 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This protein is known typically as S12 in bacteria, S23 in eukaryotes and as either S12 or S23 in the Archaea.

    Bacterial S12 molecules contain a conserved aspartic acid residue which undergoes a novel post-translational modification, beta-methylthiolation, to form the corresponding 3-methylthioaspartic acid.

    Proteins where this domain is known:
    PFC0290w   


    PIRSF002134 - Ribosomal_S13 (Pirsf link)

    Interpro entry IPR001892 : Ribosomal protein S13 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. It is a basic protein of 115 to 177 amino-acid residues. This family of ribosomal proteins is present in prokaryotes and eukaryotes.

    Proteins where this domain is known:
    PF11_0272   


    PIRSF002144 - Ribosomal_S19 (Pirsf link)

    Interpro entry IPR002222 : Ribosomal protein S19/S15 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).

    Proteins where this domain is known:
    MAL13P1.92   


    PIRSF002148 - Ribosomal_S21e (Pirsf link)

    Interpro entry IPR001931 : Ribosomal protein S21e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 82 to 87 amino acids. The amino termini are all N alpha-acetylated. The N-terminal halves of the protein molecules are highly conserved in contrast to the carboxy-terminal parts.

    Proteins where this domain is known:
    PF11_0454   


    PIRSF002155 - Ribosomal_L1 (Pirsf link)

    Interpro entry IPR002143 : Ribosomal protein L1 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L1 is the largest protein from the large ribosomal subunit. The L1 protein contains two domains: a 2-layer alpha/beta domain and a 3-layer alpha/beta domain (which interrupts the first domain). In Escherichia coli, L1 is known to bind to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    Proteins where this domain is known:
    PF14_0391   


    PIRSF002158 - Ribosomal_L2 (Pirsf link)

    Interpro entry IPR002171 : Ribosomal protein L2 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    Proteins where this domain is known:
    PF11_0337    PFE0845c   


    PIRSF002161 - Ribosomal_L5 (Pirsf link)

    Interpro entry IPR002132 : Ribosomal protein L5 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    L5 is a protein of about 180 amino-acid residues.

    Proteins where this domain is known:
    PF07_0079   


    PIRSF002162 - Ribosomal_L6 (Pirsf link)

    Interpro entry IPR000702 : Ribosomal protein L6 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.

    Proteins where this domain is known:
    PF13_0129   


    PIRSF002181 - Ribosomal_L13 (Pirsf link)

    Interpro entry IPR005822 : Ribosomal protein L13 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.

    Proteins where this domain is known:
    PF10_0043    PFB0645c   


    PIRSF002290 - Clathrin_H_chain (Pirsf link)

    Interpro entry IPR016341 : (Interpro link)

    Interpro description:

    This group represents a clathrin heavy chain.

    Proteins where this domain is known:
    PFL0930w   


    PIRSF002291 - AP_complex_beta (Pirsf link)

    Interpro entry IPR016342 : Adaptor protein complex, beta subunit (Interpro link)

    Interpro description:
    The adaptor protein complexes mediate both the recruitment of clathrin to membranes and the recognition of sorting signals within the cytosolic tails of transmembrane cargo molecules. Adaptor protein complex 1 (AP-1) is a heterotetramer composed of two large adaptins (gamma-type subunit AP1G1 and beta-type subunit AP1B1), a medium adaptin (mu-type subunit AP1M1 or AP1M2) and a small adaptin (sigma-type subunit AP1S1 or AP1S2 or AP1S3). Subunits of clathrin-associated adaptor protein complex 1 play a role in protein sorting in the late-Golgi/trans-Golgi network (TGN) and/or in endosomes.

    This group represents an adaptor protein complex, beta subunit.

    Proteins where this domain is known:
    MAL7P1.164    PFE1400c   


    PIRSF002583 - Hsp90 (Pirsf link)

    Interpro entry IPR001404 : Heat shock protein Hsp90 (Interpro link)

    Interpro description:

    Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1-2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.

    The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.

    Proteins where this domain is known:
    PF07_0029    PFL1070c   


    PIRSF003025 - eIF5A (Pirsf link)

    Interpro entry IPR001884 : Eukaryotic initiation factor 5A hypusine (eIF-5A) (Interpro link)

    Interpro description:

    Translation initiation factor 5A (IF-5A) is reported to be involved in the first step of peptide bond formation in translation, to be involved in cell-cycle regulation and to be a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A contains an unusual amino acid, hypusine (N-epsilon-(4-aminobutyl-2-hydroxy)lysine), that is required for its function. The first step in the post-translational modification of lysine to hypusine is catalysed by the enzyme deoxyhypusine synthase, the structure of which has been reported.

    The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined to 1.75 Å. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.

    Proteins where this domain is known:
    PFL0210c   


    PIRSF003113 - BolA (Pirsf link)

    Interpro entry IPR002634 : (Interpro link)

    Interpro description:
    This family consists of the morpho-protein BolA from Escherichia coli and its various homologs. In E. coli, over-expression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5.

    Proteins where this domain is known:
    PFE0790c   


    PIRSF003575 - MSA_2 (Pirsf link)

    Interpro entry IPR001136 : Merozoite surface antigen 2 (MSA-2) (Interpro link)

    Interpro description:
    The merozoite surface antigen 2 (MSA-2) may play a role in the merozoite attachment to the erythrocyte. It is thought to be attached to the membrane by a GPI-anchor.

    Proteins where this domain is known:
    PFB0300c   


    PIRSF004499 - SUI1_euk (Pirsf link)

    Interpro entry IPR005874 : Eukaryotic translation initiation factor SUI1 (Interpro link)

    Interpro description:

    Cells have evolved elaborate mechanisms to rid themselves of aberrant proteins and transcripts. The nonsense-mediated mRNA decay pathway (NMD) is an example of a pathway that eliminates aberrant mRNAs. The SUI1 protein functions in concert with eIF-2 and the initiator tRNA-Met to direct the ribosome to the proper start site of translation, and in this way contributes both to recognition of the AUG codon during translation initiation and to maintenance of the appropriate reading frame during translation elongation. In addition to these roles, SUI1 plays a role in the NMD pathway.

    Proteins where this domain is known:
    PFL2095w   


    PIRSF004557 - SecY (Pirsf link)

    Interpro entry IPR002208 : SecY protein (Interpro link)

    Interpro description:

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.

    The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.

    The eubacterial secY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.
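    The region counts quoted above follow from simple alternation: each transmembrane segment crosses the membrane once, so 10 TM segments delimit 11 non-membrane regions that alternate between the two sides. The minimal sketch below illustrates that arithmetic only. The function name is hypothetical, and the cytoplasmic N-terminus is an assumption chosen because it reproduces the 6 cytoplasmic and 5 periplasmic regions stated in the description.

```python
def loop_sides(n_tm, first_side="cytoplasmic", other_side="periplasmic"):
    """Assign a side to each of the n_tm + 1 non-membrane regions.

    Every TM segment flips the side, so the regions simply alternate,
    starting from the side of the N-terminus (assumed, not from the source).
    """
    sides = (first_side, other_side)
    return [sides[i % 2] for i in range(n_tm + 1)]


regions = loop_sides(10)  # 10 TM segments, as stated for SecY
n_cyto = regions.count("cytoplasmic")
n_peri = regions.count("periplasmic")
print(n_cyto, n_peri)  # 6 5
```

    With an even number of TM segments both termini end up on the same side, which is why one side has exactly one more region than the other.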

    Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.

    Proteins where this domain is known:
    MAL13P1.231   


    PIRSF004749 - Pep_def (Pirsf link)

    Interpro entry IPR000181 : Formylmethionine deformylase (Interpro link)

    Interpro description:

    Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:

     N-formyl-L-methionine + H2O = formate + methionyl peptide 
    Catalytic efficiency strongly depends on the identity of the bound metal.

    The structure of these enzymes is known. PDF, a member of the zinc metalloproteases family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
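
    The HEXXH motif mentioned above can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, assuming that X stands for any amino acid; the function name find_hexxh and the example sequence are hypothetical and are not taken from PFI0380c or any real deformylase.

```python
import re

# HEXXH: His, Glu, any two residues, His. The lookahead lets the
# search report overlapping occurrences as well.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")

def find_hexxh(sequence):
    """Return (0-based start, matched text) for every HEXXH occurrence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(seq)]

# Made-up sequence for illustration only (not a real deformylase).
example = "MKAAGLQHEIDHLNGKT"
hits = find_hexxh(example)
```

    A hit only shows that the sequence pattern is present; it does not by itself establish that the residues bind zinc.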

    Proteins where this domain is known:
    PFI0380c   


    PIRSF004848 - YBL036c_PLPDEIII (Pirsf link)

    Interpro entry IPR011078 : (Interpro link)

    Interpro description:

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.

    Proteins in this entry occur in archaea, bacteria and eukaryotes. They are encoded by genes which are often co-transcribed with proline biosynthesis genes, although their function in vivo has not yet been demonstrated.

    The structure of the yeast protein YBL036C has been determined to a resolution of 2.0 A. Similar in structure to the N-terminal domains of alanine racemase and ornithine decarboxylase, it forms a TIM barrel fold which begins with a long N-terminal helix, rather than the classical beta strand found at the beginning of most other TIM barrels. Unlike alanine racemase and ornithine decarboxylase, which are two-domain dimeric proteins, the yeast protein is a single domain monomer. A pyridoxal 5'-phosphate cofactor is covalently bound towards the C-terminal end of the barrel, which is the usual active site in TIM-barrel folds. Some racemase activity was observed for this protein and it was suggested by the authors that it may function as a general racemase.

    Proteins where this domain is known:
    PFI0965w   


    PIRSF004967 - DPH1 (Pirsf link)

    Interpro entry IPR016435 : (Interpro link)

    Interpro description:

    This group represents a diphthamide biosynthesis protein 1.

    Proteins where this domain is known:
    PF14_0136   


    PIRSF005067 - Tma_RNA-bind_prd (Pirsf link)

    Interpro entry IPR016437 : (Interpro link)

    Interpro description:

    This group represents a predicted translation machinery-associated RNA binding protein.

    Proteins where this domain is known:
    PFE1470w   


    PIRSF005198 - Antiviral_helicase_SKI2 (Pirsf link)

    Interpro entry IPR016438 : (Interpro link)

    Interpro description:

    This group represents a group of ATP-dependent RNA helicases including the antiviral protein SKI2 and DOB1, which is involved in 3' end formation of rRNA and mRNA transport.

    Proteins where this domain is known:
    PFF0100w   


    PIRSF005413 - COX11 (Pirsf link)

    Interpro entry IPR007533 : Cytochrome c oxidase assembly protein CtaG/Cox11 (Interpro link)

    Interpro description:
    Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C-terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilising the copper site.

    Proteins where this domain is known:
    PF14_0721   


    PIRSF005461 - 23S_rRNA_mtase (Pirsf link)

    Interpro entry IPR016448 : 23S ribosomal RNA methyltransferase (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This group represents a 23S ribosomal RNA methyltransferase.

    Proteins where this domain is known:
    PF13_0052   


    PIRSF005567 - Coatomer_beta'_subunit (Pirsf link)

    Interpro entry IPR016453 : (Interpro link)

    Interpro description:

    This group represents a coatomer, beta' subunit.

    Proteins where this domain is known:
    PFI0290c   


    PIRSF005590 - Ribosomal_L10 (Pirsf link)

    Interpro entry IPR001197 : Ribosomal protein L10e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A variety of eukaryotic and plant ribosomal L10e proteins can be grouped. This family consists of vertebrate L10 (QM), plant L10, Caenorhabditis elegans L10, yeast L10 (QSR1) and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0543.

    Proteins where this domain is known:
    PF14_0141   


    PIRSF005653 - RNA_pol_N/8_sub (Pirsf link)

    Interpro entry IPR000268 : RNA polymerases, N/8 Kd subunits (Interpro link)

    Interpro description:
    In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. Archaebacterial subunit N (gene rpoN) is a small protein of about 8 kDa; it is evolutionarily related to an 8.3 kDa component shared by all three forms of eukaryotic RNA polymerases (gene RPB10 in yeast and POLR2J in mammals) as well as to African swine fever virus (ASFV) protein CP80R. There is a conserved region which is located at the N-terminal extremity of these polymerase subunits; this region contains two cysteines that bind a zinc ion.

    Proteins where this domain is known:
    PF07_0027   


    PIRSF005856 - Rad51 (Pirsf link)

    Interpro entry IPR016467 : (Interpro link)

    Interpro description:

    This group represents a number of eukaryotic and archaeal DNA repair and recombination proteins which are homologous to the bacterial protein RecA.

    The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response.

    Proteins where this domain is known:
    MAL8P1.76    PF11_0087   


    PIRSF005963 - Lipoyl_synth (Pirsf link)

    Interpro entry IPR003698 : Lipoate synthase (Interpro link)

    Interpro description:
    Lipoic acid is a covalently bound disulphide-containing cofactor required for function of the pyruvate dehydrogenase, alpha-ketoglutarate dehydrogenase, and glycine cleavage enzyme complexes of Escherichia coli. Two genes, lipA and lipB, are involved in lipoic acid biosynthesis or metabolism. LipA is required for the insertion of the first sulphur into the octanoic acid backbone. LipB functions downstream of LipA, but its role in lipoic acid metabolism remains unclear. Lipoate synthase (or lipoic acid synthetase) catalyses the formation of alpha-(+)-lipoic acid, required for lipoate biosynthesis.

    Proteins where this domain is known:
    MAL13P1.220   


    PIRSF005992 - Clathrin_mu (Pirsf link)

    Interpro entry IPR001392 : Clathrin adaptor, mu subunit (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents the mu subunit of various clathrin adaptors (AP1, AP2 and AP3). The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0202    PF13_0062   


    PIRSF006004 - CHP00048 (Pirsf link)

    Interpro entry IPR004383 : Conserved hypothetical protein CHP00048 (Interpro link)

    Interpro description:
    This family of conserved hypothetical proteins groups bacterial proteins of unknown function.

    Proteins where this domain is known:
    PF14_0066   


    PIRSF006294 - PEP_crbxkin (Pirsf link)

    Interpro entry IPR001272 : Phosphoenolpyruvate carboxykinase, ATP-utilising (Interpro link)

    Interpro description:

    Phosphoenolpyruvate carboxykinase (PEPCK) catalyses the first committed (rate-limiting) step in hepatic gluconeogenesis, namely the reversible decarboxylation of oxaloacetate to phosphoenolpyruvate (PEP) and carbon dioxide, using either ATP or GTP as a source of phosphate. The ATP-utilising and GTP-utilising enzymes form two divergent subfamilies, which have little sequence similarity but which retain conserved active site residues. ATP-utilising PEPCKs are monomers or oligomers of identical subunits found in certain bacteria, yeast, trypanosomatids, and plants, while GTP-utilising PEPCKs are mainly monomers found in animals and some bacteria. Both require divalent cations for activity, such as magnesium or manganese. One cation interacts with the enzyme at metal binding site 1 to elicit activation, while the second cation interacts at metal binding site 2 to serve as a metal-nucleotide substrate. In bacteria, fungi and plants, PEPCK is involved in the glyoxylate bypass, an alternative to the tricarboxylic acid cycle.

    PEPCK helps to regulate blood glucose levels. The rate of gluconeogenesis can be controlled through transcriptional regulation of the PEPCK gene by cAMP (the mediator of glucagon and catecholamines), glucocorticoids and insulin. In general, PEPCK expression is induced by glucagon, catecholamines and glucocorticoids during periods of fasting and in response to stress, but is inhibited by (glucose-induced) insulin upon feeding. With type II diabetes, this regulation system can fail, resulting in increased gluconeogenesis that in turn raises glucose levels.

    PEPCK consists of an N-terminal and a catalytic C-terminal domain, with the active site and metal ions located in a cleft between them. Both domains have an alpha/beta topology that is partly similar to one another. Substrate binding causes PEPCK to undergo a conformational change, which accelerates catalysis by forcing bulk solvent molecules out of the active site. PEPCK uses an alpha/beta/alpha motif for nucleotide binding, this motif differing from other kinase domains. GTP-utilising PEPCK has a PEP-binding domain and two kinase motifs to bind GTP and magnesium.

    This entry represents ATP-utilising phosphoenolpyruvate carboxykinase enzymes.

    Proteins where this domain is known:
    PF13_0234   


    PIRSF006398 - Sec61_beta_euk (Pirsf link)

    Interpro entry IPR016482 : (Interpro link)

    Interpro description:

    A conserved heterotrimeric integral membrane protein complex--the Sec61 complex (eukaryotes) or SecY complex (prokaryotes)--forms a protein-conducting channel that allows polypeptides to be transferred across (or integrated into) the endoplasmic reticulum (eukaryotes) or across the cytoplasmic membrane (prokaryotes). This complex is itself a part of a larger translocase complex.

    The alpha subunits, called Sec61alpha in mammals, Sec61p in Saccharomyces cerevisiae (Baker's yeast), and SecY in prokaryotes, and the gamma subunits, called Sec61gamma in mammals, Sss1p in S. cerevisiae, and SecE in prokaryotes, show significant sequence conservation. Both subunits are required for cell viability in S. cerevisiae and Escherichia coli. The beta subunits, called Sec61beta in mammals, Sbh in S. cerevisiae, and Sec-beta in archaea, are not essential for cell viability; they are similar in eukaryotes and archaea, but show no obvious homology to the corresponding SecG subunits in bacteria. SecY forms the channel pore, and it is the cross-linking partner of polypeptide chains passing through the membrane. SecY and SecE constitute the high-affinity SecA-binding site on the membrane.

    The channel is a passive conduit for polypeptides. It must therefore associate with other components that provide a driving force. The partner proteins in bacteria and eukaryotes differ. In bacteria, the translocase complex comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The SecA ATPase interacts dynamically with the SecYEG integral membrane components to drive the transmembrane movement of newly synthesized preproteins. In S. cerevisiae (and probably in all eukaryotes), the full translocase comprises another membrane protein subcomplex (the tetrameric Sec62/63p complex), and the lumenal protein BiP, a member of the Hsp70 family of ATPases. BiP promotes translocation by acting as a molecular ratchet, preventing the polypeptide chain from sliding back into the cytosol.

    Proteins where this domain is known:
    MAL8P1.51   


    PIRSF006413 - IF-6 (Pirsf link)

    Interpro entry IPR002769 : Translation initiation factor IF6 (Interpro link)

    Interpro description:

    This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.

    The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 caused a dose-dependent translational block and impairment of 80S formation, which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S subunit formation. RACK1 provides a physical and functional link between PKC signalling and ribosome activation.

    Proteins where this domain is known:
    PF13_0178   


    PIRSF006487 - GcvT (Pirsf link)

    Interpro entry IPR006223 : Glycine cleavage system T protein (Interpro link)

    Interpro description:

    This is a subfamily of glycine cleavage T proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase that catalyses the following reaction:

     (6S)-tetrahydrofolate + S-aminomethyldihydrolipoylprotein = (6R)-5,10-methylenetetrahydrofolate + NH3 + dihydrolipoylprotein 

    Proteins where this domain is known:
    PF13_0345   


    PIRSF006540 - Nop17p (Pirsf link)

    Interpro entry IPR000692 : Fibrillarin (Interpro link)

    Interpro description:
    Fibrillarin is a component of a nucleolar small nuclear ribonucleoprotein (snRNP), functioning in vivo in ribosomal RNA processing. It is associated with U3, U8 and U13 small nuclear RNAs in mammals and is similar to the yeast NOP1 protein. Fibrillarin has a well conserved sequence of around 320 amino acids, and contains 3 domains: an N-terminal Gly/Arg-rich region; a central domain resembling other RNA-binding proteins and containing an RNP-2-like consensus sequence; and a C-terminal alpha-helical domain. An evolutionarily related pre-rRNA processing protein, which lacks the Gly/Arg-rich domain, has been found in various archaebacteria.

    Proteins where this domain is known:
    PF14_0068   


    PIRSF006588 - TyrRS_arch_euk (Pirsf link)

    Interpro entry IPR016485 : (Interpro link)

    Interpro description:

    This group represents a tyrosine tRNA ligase, archaeal/eukaryotic types.

    Proteins where this domain is known:
    MAL8P1.125   


    PIRSF006609 - snRNP_SmF (Pirsf link)

    Interpro entry IPR016487 : (Interpro link)

    Interpro description:

    This group represents a small nuclear ribonucleoprotein SmF.

    Proteins where this domain is known:
    PF11_0280   


    PIRSF006621 - Dus (Pirsf link)

    Interpro entry IPR001269 : tRNA-dihydrouridine synthase (Interpro link)

    Interpro description:

    Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae (Baker's yeast) acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria.

    Proteins where this domain is known:
    PF14_0086   


    PIRSF006641 - CHP00092 (Pirsf link)

    Interpro entry IPR004396 : Conserved hypothetical protein CHP00092 (Interpro link)

    Interpro description:
    This is a family of conserved hypothetical proteins found in both prokaryotes and eukaryotes. While the function of these proteins is not known, the crystal structure of a member from Haemophilus influenzae has been determined. This protein consists of three domains: an N-terminal domain which has a mononucleotide binding fold typical for the P-loop NTPases, a central domain which forms an alpha-helical coiled coil, and a C-terminal domain composed of a six-stranded half-barrel curved around an alpha helix. The central and C-terminal domains are topologically similar to RNA-binding proteins, while the N-terminal region contains the features typical of GTP-dependent molecular switches. The purified protein was capable of binding both double-stranded nucleic acid and GTP. It was suggested, therefore, that this protein might be part of a nucleoprotein complex and could function as a GTP-dependent translation factor.

    Proteins where this domain is known:
    MAL7P1.122   


    PIRSF006704 - TF_IIS (Pirsf link)

    Interpro entry IPR016492 : Transcription elongation factor, IIS (Interpro link)

    Interpro description:

    This entry represents transcription elongation factors of the IIS type. TFIIS is a component of RNA polymerase II preinitiation complexes, and is required for preinitiation complex assembly and stability. The association of TFIIS with a promoter depends on functional preinitiation complex components including Mediator and the SAGA complex. TFIIS is composed of three domains: domain 1 forms a 4-helical bundle that appears to bind certain initiation factors; domain 2 forms a 3-helical bundle and is required for Pol II binding; domain 3 forms a zinc ribbon and is essential for stimulation of RNA cleavage.

    Proteins where this domain is known:
    PF07_0057   


    PIRSF008765 - PIG-P_GPI19 (Pirsf link)

    Interpro entry IPR016542 : (Interpro link)

    Interpro description:

    This group represents a phosphatidylinositol N-acetylglucosaminyltransferase, GPI19/PIG-P subunit.

    Proteins where this domain is known:
    PFI1705w   


    PIRSF008835 - TPR_repeat_11_Fis1 (Pirsf link)

    Interpro entry IPR016543 : (Interpro link)

    Interpro description:

    This group represents a mitochondrial fission 1 protein.

    Proteins where this domain is known:
    MAL13P1.139   


    PIRSF009449 - DNA_primase_large_subunit (Pirsf link)

    Interpro entry IPR016558 : (Interpro link)

    Interpro description:

    This group represents a DNA primase, large subunit.

    Proteins where this domain is known:
    PFI0530c   


    PIRSF010045 - DUF850_TM_euk (Pirsf link)

    Interpro entry IPR008568 : (Interpro link)

    Interpro description:
    This family consists of eukaryotic putative transmembrane proteins of unknown function.

    Proteins where this domain is known:
    MAL13P1.299   


    PIRSF010052 - Polyub_prc_Npl4 (Pirsf link)

    Interpro entry IPR016563 : (Interpro link)

    Interpro description:

    This group represents a polyubiquitin-tagged protein recognition complex, Npl4 component.

    Proteins where this domain is known:
    PFE0380c   


    PIRSF015578 - Myoinos-ppht_syn (Pirsf link)

    Interpro entry IPR002587 : Myo-inositol-1-phosphate synthase (Interpro link)

    Interpro description:

    1L-myo-Inositol-1-phosphate synthase catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria, and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.

    In Saccharomyces cerevisiae (Baker's yeast), the transcriptional regulation of the INO1 gene has been studied in detail and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant).

    Proteins where this domain is known:
    PFE0585c   


    PIRSF015588 - AP_complex_sigma (Pirsf link)

    Interpro entry IPR016635 : Adaptor protein complex, sigma subunit (Interpro link)

    Interpro description:
    The adaptor protein complexes mediate both the recruitment of clathrin to membranes and the recognition of sorting signals within the cytosolic tails of transmembrane cargo molecules. Adaptor protein complex 1 (AP-1) is a heterotetramer composed of two large adaptins (gamma-type subunit AP1G1 and beta-type subunit AP1B1), a medium adaptin (mu-type subunit AP1M1 or AP1M2) and a small adaptin (sigma-type subunit AP1S1 or AP1S2 or AP1S3). Subunits of clathrin-associated adaptor protein complex 1 play a role in protein sorting in the late-Golgi/trans-Golgi network (TGN) and/or in endosomes.

    This group represents an adaptor protein complex, sigma subunit.

    Proteins where this domain is known:
    PF11_0187    PFB0805c    PFD1090c    PFL2425w   


    PIRSF015730 - TFAR19 (Pirsf link)

    Interpro entry IPR002836 : DNA-binding TFAR19-related protein (Interpro link)

    Interpro description:

    This protein family is found in archaea and eukaryota. The human TFAR19 encodes a protein which shares significant homology to the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.

    Proteins where this domain is known:
    PFI0450c   


    PIRSF015840 - DUF284_TM_euk (Pirsf link)

    Interpro entry IPR005045 : Protein of unknown function DUF284, transmembrane eukaryotic (Interpro link)

    Interpro description:
    Members of this family have no known function. They have predicted transmembrane helices.

    Proteins where this domain is known:
    PF11_0343   


    PIRSF015892 - N-myristl_transf (Pirsf link)

    Interpro entry IPR000903 : Myristoyl-CoA:protein N-myristoyltransferase (Interpro link)

    Interpro description:
    Myristoyl-CoA:protein N-myristoyltransferase (Nmt) is the enzyme responsible for transferring a myristate group to the N-terminal glycine of a number of cellular eukaryotic and viral proteins. Nmt is a monomeric protein of about 50 to 60 kD whose sequence appears to be well conserved.

    Proteins where this domain is known:
    PF14_0127   


    PIRSF015894 - Skb1_MeTrfase (Pirsf link)

    Interpro entry IPR007857 : Skb1 methyltransferase (Interpro link)

    Interpro description:
    The human homologue of Saccharomyces cerevisiae Skb1 (Shk1 kinase-binding protein 1) is a protein methyltransferase. These proteins seem to play a role in Jak signalling.

    Proteins where this domain is known:
    PF13_0323   


    PIRSF015901 - NAC_alpha (Pirsf link)

    Interpro entry IPR016641 : (Interpro link)

    Interpro description:

    This group represents a nascent polypeptide-associated complex, alpha subunit.

    Proteins where this domain is known:
    PFF1050w   


    PIRSF015919 - TFIIH_SSL1 (Pirsf link)

    Interpro entry IPR012170 : TFIIH basal transcription factor complex, subunit SSL1 (Interpro link)

    Interpro description:

    This group represents a TFIIH basal transcription factor complex, subunit SSL1.

    Proteins where this domain is known:
    MAL13P1.76   


    PIRSF015945 - ATPase_V1_F_euk (Pirsf link)

    Interpro entry IPR005772 : ATPase, V1 complex, subunit F, eukaryotic (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    This entry represents subunit F found in the V1 complex of V-ATPases in eukaryotes. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF11_0412   


    PIRSF015947 - 26S_Psome_Rpn2 (Pirsf link)

    Interpro entry IPR016642 : 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn2/Psmd1 subunit (Interpro link)

    Interpro description:

    Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides.

    The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation.

    This group represents a 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn2/Psmd1 subunit.

    Proteins where this domain is known:
    PF14_0632   


    PIRSF015952 - U3snoRNP11 (Pirsf link)

    Interpro entry IPR007144 : Small-subunit processome, Utp11 (Interpro link)

    Interpro description:

    A large ribonuclear protein complex is required for the processing of the small-ribosomal-subunit rRNA - the small-subunit (SSU) processome. This preribosomal complex contains the U3 snoRNA and at least 40 proteins, which have the following properties:

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA and rRNA binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble. From electron microscopy, the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA.

    This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3.

    Proteins where this domain is known:
    PFL2295w   


    PIRSF015965 - 26S_Psome_Rpn1 (Pirsf link)

    Interpro entry IPR016643 : 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn1 subunit (Interpro link)

    Interpro description:

    Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides.

    The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation.

    This group represents a 26S proteasome regulatory complex, non-ATPase subcomplex, Rpn1 (regulatory-particle non-ATPase subunit 1). This subunit is essential for embryogenesis in Arabidopsis thaliana.

    Proteins where this domain is known:
    PFB0260w   


    PIRSF016013 - AtER_Rer1p (Pirsf link)

    Interpro entry IPR004932 : Retrieval of early ER protein Rer1 (Interpro link)

    Interpro description:

    RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex.

    Proteins where this domain is known:
    PFI0150c   


    PIRSF016089 - SPC22 (Pirsf link)

    Interpro entry IPR007653 : Signal peptidase 22 kDa subunit (Interpro link)

    Interpro description:
    Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC), which consists of four subunits in yeast and five in mammals. This family is described as similar to microsomal signal peptidase 23 kDa subunit. Found in eukaryotes.

    Proteins where this domain is known:
    PFI0215c   


    PIRSF016104 - GPI2 (Pirsf link)

    Interpro entry IPR009450 : Phosphatidylinositol N-acetylglucosaminyltransferase (Interpro link)

    Interpro description:

    Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves the products of three genes in yeast (GPI1, GPI2 and GPI3) and four genes in mammals (GPI1, PIG A, PIG H and PIG C).

    Proteins where this domain is known:
    PFI0535w   


    PIRSF016255 - eIF3e_su6 (Pirsf link)

    Interpro entry IPR016650 : (Interpro link)

    Interpro description:

    This group represents a eukaryotic translation initiation factor 3, subunit 6.

    Proteins where this domain is known:
    PFE1405c   


    PIRSF016281 - EIF-3_zeta (Pirsf link)

    Interpro entry IPR007783 : Eukaryotic translation initiation factor 3, subunit 7 (Interpro link)

    Interpro description:
    This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40S ribosomal subunits, stabilisation of ternary complex binding to 40S subunits, and dissociation of 40S and 60S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals.

    Proteins where this domain is known:
    PF10_0077   


    PIRSF016305 - LCM_mtfrase (Pirsf link)

    Interpro entry IPR016651 : (Interpro link)

    Interpro description:

    Leucine carboxymethyltransferases methylate the carboxyl group of leucine residues to form alpha-leucine ester residues. The family includes LCMT1, which regulates the activity of serine/threonine phosphatase 2A (PP2A) through methylation of the C-terminal leucine residue of the catalytic subunit of PP2A. This affects the heteromultimeric composition of PP2A, which in turn affects protein recognition and substrate specificity. Like many other methyltransferases, LCMT1 uses S-adenosylmethionine (SAM) as the methyl donor. LCMT1 contains the common SAM-dependent methyltransferase core fold, with various insertions and additions creating a specific PP2A binding site.

    This group represents the LCMT1 subgroup of leucine carboxymethyltransferases.

    Proteins where this domain is known:
    PF14_0376   


    PIRSF016308 - UBP (Pirsf link)

    Interpro entry IPR016652 : (Interpro link)

    Interpro description:

    This group represents a ubiquitin-specific protease (ubiquitin carboxyl-terminal hydrolase).

    Proteins where this domain is known:
    PFD0655c   


    PIRSF016323 - tRNA_m1G_mtfrase_met (Pirsf link)

    Interpro entry IPR016653 : (Interpro link)

    Interpro description:

    This group represents a tRNA (guanine-N(1)-)-methyltransferase, eukaryotic type.

    Proteins where this domain is known:
    PF11_0198   


    PIRSF016325 - Phstyr_phstse_ac (Pirsf link)

    Interpro entry IPR004327 : Phosphotyrosyl phosphatase activator, PTPA (Interpro link)

    Interpro description:
    Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognized phosphoserine/threonine protein phosphorylase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumour suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1.

    Proteins where this domain is known:
    PF14_0280   


    PIRSF016393 - Enh_rudimentary (Pirsf link)

    Interpro entry IPR000781 : (Interpro link)

    Interpro description:
    The Drosophila protein 'enhancer of rudimentary' (gene e(r)) is a small protein of 104 residues whose function is not yet clear. From an evolutionary point of view, it is highly conserved and has been found to exist in probably all multicellular eukaryotic organisms. It has been proposed that this protein plays a role in the cell cycle.

    Proteins where this domain is known:
    PF10_0370   


    PIRSF016394 - U6_snRNA_Lsm2 (Pirsf link)

    Interpro entry IPR016654 : (Interpro link)

    Interpro description:

    This group represents a U6 snRNA-associated Sm-like protein LSm2.

    Proteins where this domain is known:
    PFE1020w   


    PIRSF016396 - Prefoldin_subunit_3 (Pirsf link)

    Interpro entry IPR016655 : (Interpro link)

    Interpro description:

    This group represents a prefoldin, subunit 3.

    Proteins where this domain is known:
    MAL7P1.94   


    PIRSF016477 - Prefoldin_subunit_4 (Pirsf link)

    Interpro entry IPR016661 : (Interpro link)

    Interpro description:

    This group represents a prefoldin, subunit 4.

    Proteins where this domain is known:
    PFI0220w   


    PIRSF017190 - Rbsml_synth_fac_NIP7 (Pirsf link)

    Interpro entry IPR016686 : Ribosome biogenesis factor, NIP7 (Interpro link)

    Interpro description:

    This entry represents 60S ribosome subunit biogenesis protein Nip7, which is required for proper 27S pre-rRNA processing and 60S ribosome subunit assembly. In yeast, Nip7 interacts with nucleolar proteins such as Nol8, and with the exosome subunit Rrp43p. Nip7 contains a PUA domain.

    Proteins where this domain is known:
    PF14_0635   


    PIRSF017199 - mRNA_splic_U5 (Pirsf link)

    Interpro entry IPR004123 : mRNA splicing factor, thioredoxin-like U5 snRNP (Interpro link)

    Interpro description:

    Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of 2 cysteine thiol groups to a disulphide, accompanied by the transfer of 2 electrons and 2 protons. The net result is the covalent interconversion of a disulphide and a dithiol.

    Compared to human thioredoxin, human U5 snRNP-specific protein U5-15kD contains 37 additional residues that may cause structural changes which most likely form putative binding sites for other spliceosomal proteins or RNA. Although U5-15kD apparently lacks protein disulphide isomerase activity, it is strictly required for pre-mRNA splicing.

    Proteins where this domain is known:
    PFL1520w   


    PIRSF017205 - ERO1 (Pirsf link)

    Interpro entry IPR007266 : Endoplasmic reticulum oxidoreductin 1 (Interpro link)

    Interpro description:
    Members of this family are required for the formation of disulphide bonds in the endoplasmic reticulum.

    Proteins where this domain is known:
    PF11_0251   


    PIRSF017207 - UCP017207_TM-p85 (Pirsf link)

    Interpro entry IPR016687 : (Interpro link)

    Interpro description:

    This group represents a predicted transmembrane protein 85.

    Proteins where this domain is known:
    PF14_0335   


    PIRSF017222 - eIF2A (Pirsf link)

    Interpro entry IPR011387 : (Interpro link)

    Interpro description:

    This group represents a translation initiation factor eIF-2A.

    Proteins where this domain is known:
    PF14_0360   


    PIRSF017259 - tRNA_mtfrase_TRM11 (Pirsf link)

    Interpro entry IPR016691 : (Interpro link)

    Interpro description:

    This group represents a tRNA guanosine-2'-O-methyltransferase, TRM11 type.

    Proteins where this domain is known:
    PF13_0236   


    PIRSF017269 - GCD14 (Pirsf link)

    Proteins where this domain is known:
    PF13_0087   


    PIRSF017479 - TRAPP_I_complex_Trs31 (Pirsf link)

    Interpro entry IPR016696 : (Interpro link)

    Interpro description:

    This group represents a TRAPP I complex, Trs31 subunit.

    Proteins where this domain is known:
    PF14_0358   


    PIRSF017888 - CPSF-25 (Pirsf link)

    Interpro entry IPR016706 : (Interpro link)

    Interpro description:

    This group represents a cleavage and polyadenylation specificity factor, 25 kDa subunit.

    Proteins where this domain is known:
    PFA0450c   


    PIRSF018293 - TRAPP_I_complex_Bet3 (Pirsf link)

    Interpro entry IPR016721 : (Interpro link)

    Interpro description:

    This group represents a TRAPP I complex, Bet3 subunit.

    Proteins where this domain is known:
    PFD0895c   


    PIRSF018300 - DNA_pol_alph_2 (Pirsf link)

    Interpro entry IPR016722 : (Interpro link)

    Interpro description:

    This group represents a DNA polymerase alpha, subunit B.

    Proteins where this domain is known:
    PF14_0602   


    PIRSF018425 - PolyA_polymerase (Pirsf link)

    Interpro entry IPR014492 : Poly(A) polymerase (Interpro link)

    Interpro description:

    Members of this group are poly(A) polymerases (polynucleotide adenylyltransferases, PAP). In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. Poly(A) polymerase, the central enzyme of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA.

    The catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Other conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.

    The central domain of poly(A) polymerase shares structural similarity with the allosteric activity domain of ribonucleotide reductase R1, which comprises a four-helix bundle and a three-stranded mixed beta-sheet. Even though the two enzymes bind ATP, the ATP-recognition motifs are different. The C-terminal domain is predicted to be an RNA-binding domain because it folds into a compact domain reminiscent of the RNA-recognition motif fold.

    The C-terminal region beyond the predicted RNA-binding domain is only conserved in vertebrates and is dispensable for catalytic activity in vitro. The extended C-terminal domain of vertebrate PAPs is rich in serines and threonines, and enzyme activity can be down-regulated by phosphorylation at multiple sites. The extreme C terminus of PAP is also the target for another type of regulation. The U1A protein, a component of the U1 snRNP which functions in 5' splice site recognition, is known to inhibit polyadenylation of its own mRNA by binding to PAP. The C terminus of PAP is also involved in protein-protein interactions with the splicing factor U2AF65 and the snRNP protein U1-70K.

    Note thatcontains an unrelated group with at least some of the members displaying poly(A) polymerase activity.

    Proteins where this domain is known:
    PFF1240w   


    PIRSF018497 - V-ATP_synth_D (Pirsf link)

    Interpro entry IPR016727 : ATPase, V0 complex, subunit D (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    This entry represents subunit D from the V0 complex of V-ATPases, which are involved in the translocation of protons across a membrane. There is more than one type of D subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF14_0615   


    PIRSF019693 - VAMP-associated (Pirsf link)

    Interpro entry IPR016763 : (Interpro link)

    Interpro description:

    This group represents a vesicle-associated membrane protein.

    Proteins where this domain is known:
    PF14_0377   


    PIRSF022536 - A612L_SET (Pirsf link)

    Interpro entry IPR009207 : (Interpro link)

    Interpro description:
    This group represents a predicted histone methyltransferase, PBCV type.

    Proteins where this domain is known:
    PFL0690c   


    PIRSF022993 - Profilin_apicomplexa (Pirsf link)

    Interpro entry IPR016814 : (Interpro link)

    Interpro description:

    This group represents a profilin, apicomplexa type.

    Proteins where this domain is known:
    PFI1565w   


    PIRSF023322 - DUF841_euk (Pirsf link)

    Interpro entry IPR008559 : (Interpro link)

    Interpro description:
    This family consists of several eukaryotic proteins with no known function.

    Proteins where this domain is known:
    PF13_0331   


    PIRSF023577 - ENOS_interacting (Pirsf link)

    Interpro entry IPR016818 : (Interpro link)

    Interpro description:

    This group represents a nitric oxide synthase-interacting protein.

    Proteins where this domain is known:
    PFE0900w   


    PIRSF025023 - Spt4 (Pirsf link)

    Interpro entry IPR009287 : Transcription initiation Spt4 (Interpro link)

    Interpro description:

    This family consists of several eukaryotic transcription initiation Spt4 proteins. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles.

    Proteins where this domain is known:
    PF10_0293   


    PIRSF025798 - Cables (Pirsf link)

    Interpro entry IPR012388 : Cdk5 and c-Abl linker protein cables (Interpro link)

    Interpro description:

    This group represents a Cdk5 and c-Abl linker protein cables.

    Proteins where this domain is known:
    PFF0270c   


    PIRSF027110 - PREG (Pirsf link)

    Interpro entry IPR012389 : (Interpro link)

    Interpro description:

    This group represents a negative regulatory factor PREG.

    Proteins where this domain is known:
    PFE0920c   


    PIRSF028729 - E3_ubiquit_lig_SCF_Skp (Pirsf link)

    Interpro entry IPR016897 : (Interpro link)

    Interpro description:

    This group represents an E3 ubiquitin ligase SCF complex, Skp subunit.

    Proteins where this domain is known:
    MAL13P1.337   


    PIRSF028763 - RNA_pol_Rpc34 (Pirsf link)

    Interpro entry IPR007832 : RNA polymerase Rpc34 (Interpro link)

    Interpro description:
    The family comprises a subunit specific to RNA Pol III, the tRNA specific polymerase. The C34 subunit of Saccharomyces cerevisiae RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment.

    Proteins where this domain is known:
    PF14_0207   


    PIRSF028836 - ISN1 (Pirsf link)

    Interpro entry IPR009453 : IMP-specific 5'-nucleotidase (Interpro link)

    Interpro description:

    The Saccharomyces cerevisiae ISN1 (YOR155c) gene encodes an IMP-specific 5'-nucleotidase, which catalyses degradation of IMP to inosine as part of the purine salvage pathway.

    Proteins where this domain is known:
    PFL0305c   


    PIRSF029271 - Pdx1 (Pirsf link)

    Interpro entry IPR001852 : (Interpro link)

    Interpro description:

    Snz1p is a highly conserved protein involved in growth arrest in Saccharomyces cerevisiae (Baker's yeast). Sor1 (singlet oxygen resistance) is essential in pyridoxine (vitamin B6) synthesis in Cercospora nicotianae and Aspergillus flavus. Pyridoxine quenches singlet oxygen at a rate comparable to that of vitamins C and E, two of the most highly efficient biological antioxidants, suggesting a previously unknown role for pyridoxine in active oxygen resistance.

    Proteins where this domain is known:
    PFF1025c   


    PIRSF032184 - ATPase_V1_H (Pirsf link)

    Interpro entry IPR004908 : ATPase, V1 complex, subunit H (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    This entry represents subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0034   


    PIRSF036363 - PPP_BSU1 (Pirsf link)

    Interpro entry IPR012391 : Serine/threonine protein phosphatase, BSU1 (Interpro link)

    Interpro description:

    This group represents a serine/threonine protein phosphatase, BSU1 type.

    Proteins where this domain is known:
    PF14_0630   


    PIRSF036424 - eIF3b (Pirsf link)

    Interpro entry IPR011400 : Translation initiation factor eIF-3b (Interpro link)

    Interpro description:

    This group represents a translation initiation factor eIF-3b, which binds to the 40S ribosome and promotes the binding of methionyl-tRNAi and mRNA. eIF-3 is composed of at least 12 different subunits.

    Proteins where this domain is known:
    PFE0885w   


    PIRSF036432 - Diphthine_synth (Pirsf link)

    Interpro entry IPR004551 : Diphthine synthase (Interpro link)

    Interpro description:

    Diphthine synthase, also known as diphthamide biosynthesis S-adenosylmethionine-dependent methyltransferase, participates in the modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. It is required for the methylation step in diphthamide biosynthesis. The protein was characterised in Saccharomyces cerevisiae and designated DPH5.

    Proteins where this domain is known:
    PF10_0087   


    PIRSF036578 - RFC1 (Pirsf link)

    Interpro entry IPR012178 : DNA replication factor C, large subunit (Interpro link)

    Interpro description:

    This group represents a DNA replication factor C, large subunit.

    Proteins where this domain is known:
    PFB0895c   


    PIRSF036580 - Cyclin_L (Pirsf link)

    Interpro entry IPR017060 : (Interpro link)

    Interpro description:

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.

    This group represents a cyclin, L type.

    Proteins where this domain is known:
    PF13_0022   


    PIRSF036805 - 26S_protsm_s5a (Pirsf link)

    Interpro entry IPR014624 : (Interpro link)

    Interpro description:

    This group represents a predicted 26S proteasome regulatory complex, non-ATPase subcomplex, subunit s5a, Plasmodium type.

    Proteins where this domain is known:
    PF08_0109   


    PIRSF036945 - Spt5 (Pirsf link)

    Interpro entry IPR017071 : Transcription elongation factor Spt5 (Interpro link)

    Interpro description:

    This family consists of several eukaryotic transcription elongation Spt5 proteins. These proteins contain two copies of a domain (Supt5) that is characteristic of proteins involved in chromatin regulation. An NGN domain separates the Supt5 domains. In yeast Spt5 protein, this domain possesses an RNP-like fold and it is thought to confer affinity for Spt4 protein. Supt5 domains are followed by four to five copies of a KOW domain, present in many ribosomal proteins.

    Three transcription-elongation factors, Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans, including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles.

    This information was partially derived from InterPro.

    Proteins where this domain is known:
    PFF0535c   


    PIRSF037010 - Splicing_factor_3B_subunit_5 (Pirsf link)

    Interpro entry IPR017089 : (Interpro link)

    Interpro description:

    This group represents a splicing factor 3B, subunit 5.

    Proteins where this domain is known:
    PF13_0296   


    PIRSF037023 - Rhomboid-like_ROM4_ROM5 (Pirsf link)

    Interpro entry IPR017092 : Peptidase S54, rhomboid-like, Rom4/Rom5, apicomplexa (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This entry represents Apicomplexa rhomboid-like proteins, Rom4/Rom5, which are members of the S54 peptidase family of proteins. These proteins are putative serine proteases involved in intra-membrane proteolysis and the subsequent release of polypeptides from their membrane anchors. They cleave type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains.

    Proteins where this domain is known:
    PFE0340c   


    PIRSF037093 - Coatomer_gamma_subunit (Pirsf link)

    Interpro entry IPR017106 : (Interpro link)

    Interpro description:

    This group represents a coatomer, gamma subunit.

    Proteins where this domain is known:
    PF11_0463   


    PIRSF037097 - AP4_complex_epsilon (Pirsf link)

    Interpro entry IPR017109 : (Interpro link)

    Interpro description:

    Adapter-like complex 4 (AP-4) is a heterotetramer composed of two large adaptins (epsilon-type subunit AP4E1 and beta-type subunit AP4B1), a medium adaptin (mu-type subunit AP4M1) and a small adaptin (sigma-type AP4S1). It is a subunit of a novel type of clathrin- or non-clathrin-associated protein coat involved in targeting proteins from the trans-Golgi network (TGN) to the endosomal-lysosomal system.

    This group represents an adaptor protein complex AP-4, epsilon subunit.

    Proteins where this domain is known:
    PFI0200c   


    PIRSF037125 - D-site_20S_pre-rRNA_nuclease (Pirsf link)

    Interpro entry IPR017117 : (Interpro link)

    Interpro description:

    This group represents a D-site 20S pre-rRNA nuclease.

    Proteins where this domain is known:
    PFD0905w   


    PIRSF037188 - U6_snRNA_Lsm7 (Pirsf link)

    Interpro entry IPR017132 : (Interpro link)

    Interpro description:

    This group represents a U6 snRNA-associated Sm-like protein LSm7.

    Proteins where this domain is known:
    PFL0460w   


    PIRSF037336 - IspG_like (Pirsf link)

    Interpro entry IPR017178 : 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase, atypical (Interpro link)

    Interpro description:

    This protein previously of unknown biochemical function is essential in Escherichia coli. It has now been characterised as 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase, which converts 2C-methyl-D-erythritol 2,4-cyclodiphosphate (ME-2,4CPP) into 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate in the sixth step of nonmevalonate terpenoid biosynthesis. The family is restricted to bacteria, where it is widely but not universally distributed. No homology can be detected between this family and other proteins.

    This entry represents a group of atypical 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthases which contain a partially-duplicated domain.

    Proteins where this domain is known:
    PF10_0221   


    PIRSF037512 - PxSR (Pirsf link)

    Interpro entry IPR017229 : (Interpro link)

    Interpro description:

    This group represents a multidomain scavenger receptor-like protein PxSR.

    Proteins where this domain is known:
    PF14_0067   


    PIRSF037671 - Transprt_Chloroquine_res (Pirsf link)

    Interpro entry IPR017258 : (Interpro link)

    Interpro description:

    This group represents a transmembrane transporter protein, chloroquine resistant type.

    Proteins where this domain is known:
    MAL7P1.27   


    PIRSF037677 - DNA_mis_repair_Msh6 (Pirsf link)

    Interpro entry IPR017261 : (Interpro link)

    Interpro description:

    This group represents a DNA mismatch repair protein Msh6.

    Proteins where this domain is known:
    PFE0270c   


    PIRSF037736 - SCO1 (Pirsf link)

    Interpro entry IPR017276 : Synthesis of cytochrome c oxidase, Sco1/Sco2 (Interpro link)

    Interpro description:

    This entry represents cytochrome c oxidase assembly factors Sco1 and Sco2 (Synthesis of Cytochrome c Oxidase, factors 1 and 2), mitochondrial inner membrane-tethered metallochaperones that have regulatory roles in the maintenance of cellular copper homeostasis. These proteins are essential for the assembly of the catalytic core of cytochrome c oxidase (COX or complex IV), as well as other roles in copper homeostasis such as mitochondrial redox signalling. Both Sco1 and Sco2 contain highly conserved CXXXC motifs thought to be required for copper binding.

    COX is the terminal enzyme of the energy transducing respiratory chain in eukaryotes and certain prokaryotes. It catalyses the transfer of electrons from cytochrome c to molecular oxygen and pumps protons across the mitochondrial inner membrane to establish a proton gradient for ATP synthesis. It consists of 12-13 protein subunits, with 3 subunits (Cox1-Cox3) forming the enzyme core. COX uses haem and copper as cofactors: Cox1 contains a 1-copper centre (CuB) that interacts with the haem moiety and Cox2 contains a 2-copper centre (CuA). Sco1 and Sco2 act as copper chaperones, transporting copper to the CuA site in Cox2, and are thought to have cooperative functions in COX assembly. In addition, human Sco2 is the downstream mediator of the balance between the utilization of respiratory and glycolytic pathways, and both Sco1 and Sco2 may have roles in regulating cellular copper levels (homeostasis). Sco2 may have a copper-level-detection signalling role, acting upstream and in conjunction with Sco1.

    Defects in Sco1 are a cause of cytochrome c oxidase deficiency (COX deficiency) (OMIM:220110), a clinically heterogeneous disorder with features ranging from isolated myopathy to severe multisystem disease, and onset from infancy to adulthood. Defects in Sco2 are the cause of fatal infantile cardioencephalomyopathy with cytochrome c oxidase deficiency (FIC) (OMIM:604377, OMIM:220110), which is characterised by hypertrophic cardiomyopathy, lactic acidosis, and gliosis.

    Proteins where this domain is known:
    PF07_0034   


    PIRSF037755 - Mettl2_prd (Pirsf link)

    Interpro entry IPR017280 : Methyltransferase, METTL2, predicted (Interpro link)

    Interpro description:

    This group represents a predicted methyltransferase, METTL2 type.

    Proteins where this domain is known:
    PFL2305w   


    PIRSF037759 - Histone_Asf1 (Pirsf link)

    Interpro entry IPR017282 : (Interpro link)

    Interpro description:

    This group represents a histone deposition protein Asf1.

    Proteins where this domain is known:
    PFL1180w   


    PIRSF037900 - Subtilisin_rel_PfSUB_1 (Pirsf link)

    Interpro entry IPR017314 : (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The subtilisin-serine protease (SRSP) family hormone and pro-protein convertases (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/7/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins and others are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C-terminus.

    This entry contains serine peptidases belonging to MEROPS peptidase family S8A (subtilisin family, clan SB). All of the peptidases in this entry derive from Plasmodium spp. and are called 'subtilisin-like peptidase 1'.

    The peptidase in Plasmodium falciparum (isolate 3D7) termed pfSUB1 is a component of the exoneme. PfSUB1 mediates the proteolytic maturation of at least two essential members of another enzyme family called SERA. This proteolytic processing event is required for the release of viable parasites from the host erythrocyte.

    Proteins where this domain is known:
    PFE0370c   


    PIRSF037913 - His_deacetylse_1 (Pirsf link)

    Interpro entry IPR003084 : Histone deacetylase (Interpro link)

    Interpro description:
    Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal of the acetyl group. Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are all members of this ancient protein superfamily.

    HDAs function in multi-subunit complexes, reversing the acetylation of histones by histone acetyltransferases, and are also believed to deacetylate general transcription factors such as TFIIF and sequence-specific transcription factors such as p53. Thus, HDAs contribute to the regulation of transcription, in particular transcriptional repression. At N-terminal tails of histones, removal of the acetyl group from the epsilon-amino group of a lysine side chain will restore its positive charge, which may stabilise the histone-DNA interaction and prevent activating transcription factors binding to promoter elements. HDAs play important roles in the cell cycle and differentiation, and their deregulation can contribute to the development of cancer.

    Proteins where this domain is known:
    PFI1260c   


    PIRSF037949 - Transl_init_eIF-3_RNA-bind (Pirsf link)

    Interpro entry IPR017334 : (Interpro link)

    Interpro description:

    This group represents a translation initiation factor 3, RNA-binding subunit.

    Proteins where this domain is known:
    MAL8P1.83   


    PIRSF037956 - UCP037956_ZnF_Ran (Pirsf link)

    Interpro entry IPR017337 : (Interpro link)

    Interpro description:

    This group represents an uncharacterised protein with zinc finger Ran-binding domain, ZRANB2-type.

    Proteins where this domain is known:
    PFD0405c   


    PR00052 - FIBRILLARIN (Prints link)

    Interpro entry IPR000692 : Fibrillarin (Interpro link)

    Interpro description:
    Fibrillarin is a component of a nucleolar small nuclear ribonucleoprotein (SnRNP), functioning in vivo in ribosomal RNA processing. It is associated with U3, U8 and U13 small nuclear RNAs in mammals and is similar to the yeast NOP1 protein. Fibrillarin has a well conserved sequence of around 320 amino acids, and contains 3 domains, an N-terminal Gly/Arg-rich region; a central domain resembling other RNA-binding proteins and containing an RNP-2-like consensus sequence; and a C-terminal alpha-helical domain. An evolutionarily related pre-rRNA processing protein, which lacks the Gly/Arg-rich domain, has been found in various archaebacteria.

    Proteins where this domain is known:
    PF14_0068   


    PR00058 - RIBOSOMALL5 (Prints link)

    Interpro entry IPR005485 : Ribosomal protein L5, eukaryotic (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family consists of ribosomal protein L5 from eukaryotes. The ribosomal 5S RNA is the only known rRNA species to bind a ribosomal protein before its assembly into the ribosomal subunits. In eukaryotes, the 5S rRNA molecule binds one protein species, a 34-kDa protein which has been implicated in the intracellular transport of 5S rRNA.

    Proteins where this domain is known:
    PF14_0230   


    PR00062 - RIBOSOMALL20 (Prints link)

    Interpro entry IPR005812 : Ribosomal protein L20, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family covers bacterial ribosomal protein L20 and its chloroplast equivalent.

    Proteins where this domain is known:
    PF14_0709   


    PR00063 - RIBOSOMALL27 (Prints link)

    Interpro entry IPR001684 : Ribosomal protein L27 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria).

    Proteins where this domain is known:
    PF10_0332   


    PR00064 - RIBOSOMALL35 (Prints link)

    Interpro entry IPR001706 : Ribosomal protein L35 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L35 is a basic protein of 60 to 70 amino-acid residues from the large (50S) subunit. Like many basic polypeptides, L35 completely inhibits ornithine decarboxylase when present unbound in the cell, but the inhibitory function is abolished upon its incorporation into ribosomes. It belongs to a family of ribosomal proteins, including L35 from bacteria, plant chloroplast, red algae chloroplasts and cyanelles. In plants it is a nuclear encoded gene product, which suggests a chloroplast-to-nucleus relocation during the evolution of higher plants.

    Proteins where this domain is known:
    PFI0375w   


    PR00075 - FACDDSATRASE (Prints link)

    Interpro entry IPR015876 : (Interpro link)

    Interpro description:

    Fatty acid desaturases are enzymes that catalyze the insertion of a double bond at the delta position of fatty acids.

    There seem to be two distinct families of fatty acid desaturases that do not appear to be evolutionarily related: the first contains stearoyl-CoA desaturase (SCD); the second includes plant stearoyl-acyl-carrier-protein desaturase and cyanobacteria desA protein.

    SCD is a key regulatory enzyme of unsaturated fatty acid biosynthesis. In association with cytochrome b5 and NADP-dependent cytochrome b5 reductase, it constitutes part of a microsomal membrane-bound 3-component system in animals and fungi. SCD contains 4 putative transmembrane (TM) regions that anchor it in the microsomal membrane. SCD uses oxygen and electrons from reduced cytochrome b5 to catalyse the insertion of a cis double bond between carbons 9 and 10 of a spectrum of fatty acids. The preferred substrates of SCD are palmitoyl-CoA and stearoyl-CoA, which are converted to palmitoleic (16:1) and oleic (18:1) acids respectively. These unsaturated molecules are the major storage form of fatty acids (as triacylglycerols) in adipocytes.

    Proteins where this domain is known:
    PFE0555w   


    PR00076 - 6PGDHDRGNASE (Prints link)

    Interpro entry IPR006183 : 6-phosphogluconate dehydrogenase (Interpro link)

    Interpro description:

    6-Phosphogluconate dehydrogenase (6PGD) is an oxidative carboxylase that catalyses the oxidative decarboxylation of 6-phosphogluconate into ribulose 5-phosphate, with concomitant reduction of NADP. This reaction is a component of the hexose mono-phosphate shunt and pentose phosphate pathways (PPP). Prokaryotic and eukaryotic 6PGD are proteins of about 470 amino acids whose sequences are highly conserved. The protein is a homodimer in which the monomers act independently: each contains a large, mainly alpha-helical domain and a smaller beta-alpha-beta domain, containing a mixed parallel and anti-parallel 6-stranded beta sheet. NADP is bound in a cleft in the small domain, the substrate binding in an adjacent pocket.

    Proteins where this domain is known:
    PF14_0520   


    PR00077 - GPDHDRGNASE (Prints link)

    Interpro entry IPR006168 : NAD-dependent glycerol-3-phosphate dehydrogenase (Interpro link)

    Interpro description:
    NAD-dependent glycerol-3-phosphate dehydrogenase (GPD) catalyzes the reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a cytoplasmic protein, active as a homodimer, each monomer containing an N-terminal NAD binding site. In insects, it acts in conjunction with a mitochondrial alpha-glycerophosphate oxidase in the alpha-glycerophosphate cycle, which is essential for the production of energy used in insect flight.

    Proteins where this domain is known:
    PF11_0157    PFL0780w   


    PR00080 - SDRFAMILY (Prints link)

    Interpro entry IPR002198 : Short-chain dehydrogenase/reductase SDR (Interpro link)

    Interpro description:
    The short-chain dehydrogenases/reductases family (SDR) is a very large family of enzymes, most of which are known to be NAD- or NADP-dependent oxidoreductases. As the first member of this family to be characterised was Drosophila alcohol dehydrogenase, this family used to be called 'insect-type', or 'short-chain' alcohol dehydrogenases. Most members of this family are proteins of about 250 to 300 amino acid residues. Most dehydrogenases possess at least 2 domains, the first binding the coenzyme, often NAD, and the second binding the substrate. This latter domain determines the substrate specificity and contains amino acids involved in catalysis. Little sequence similarity has been found in the coenzyme binding domain although there is a large degree of structural similarity, and it has therefore been suggested that the structure of dehydrogenases has arisen through gene fusion of a common ancestral coenzyme nucleotide sequence with various substrate specific domains.

    Proteins where this domain is known:
    PFI1125c   


    PR00081 - GDHRDH (Prints link)

    Interpro entry IPR002347 : Glucose/ribitol dehydrogenase (Interpro link)

    Interpro description:
    Glucose dehydrogenase catalyses the oxidation of D-glucose without prior phosphorylation to D-beta-gluconolactone using NAD or NADP as a coenzyme. The enzyme is a tetrameric protein, each of the 4 identical subunits containing 262 amino acid residues. This family is a subset of a more general family of short-chain dehydrogenases and reductases. A match to this extension indicates that the protein is not an alcohol dehydrogenase, but another type of dehydrogenase or reductase.

    Proteins where this domain is known:
    PFD1035w    PFF1265w    PFI1125c   


    PR00082 - GLFDHDRGNASE (Prints link)

    Interpro entry IPR006095 : Glutamate/phenylalanine/leucine/valine dehydrogenase (Interpro link)

    Interpro description:

    Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.

    Glutamate dehydrogenases (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction. The latter form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids.

    Leucine dehydrogenase (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.

    Phenylalanine dehydrogenase (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate.

    Valine dehydrogenase (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate.

    Proteins where this domain is known:
    PF14_0286   


    PR00086 - LLDHDRGNASE (Prints link)

    Interpro entry IPR001557 : L-lactate/malate dehydrogenase (Interpro link)

    Interpro description:

    This family contains both lactate and malate dehydrogenases. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.

    L-lactate dehydrogenase (LDH) catalyses the reversible NAD-dependent interconversion of pyruvate and L-lactate. In vertebrate muscles and in lactic acid bacteria it represents the final step in anaerobic glycolysis. This tetrameric enzyme is present in prokaryotic and eukaryotic organisms. In vertebrates there are three isozymes of LDH: the M form (LDH-A), found predominantly in muscle tissues; the H form (LDH-B), found in heart muscle; and the X form (LDH-C), found only in the spermatozoa of mammals and birds. In the eye lenses of birds and crocodilians, LDH-B serves as a structural protein and is known as epsilon-crystallin.

    L-2-hydroxyisocaproate dehydrogenase (L-hicDH) catalyses the reversible and stereospecific interconversion between 2-ketocarboxylic acids and L-2-hydroxy-carboxylic acids. L-hicDH is evolutionarily related to LDHs.

    Proteins where this domain is known:
    PF13_0144   


    PR00096 - GATASE (Prints link)

    Interpro entry IPR011702 : Glutamine amidotransferase superfamily (Interpro link)

    Interpro description:

    Glutamine amidotransferase (GATase) activity involves the removal of the ammonia group from a glutamine molecule and its subsequent transfer to a specific substrate, thus creating a new carbon-nitrogen group on the substrate. This activity is found in a range of biosynthetic enzymes, including glutamine amidotransferase, anthranilate synthase component II, p-aminobenzoate, and glutamine-dependent carbamoyl-transferase (CPSase). Glutamine amidotransferase (GATase) domains can occur either as single polypeptides, as in glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. On the basis of sequence similarities two classes of GATase domains have been identified: class-I (also known as trpG-type) and class-II (also known as purF-type). Class-I GATase domains are defined by a conserved catalytic triad consisting of cysteine, histidine and glutamate. Class-I GATase domains have been found in the following enzymes: the second component of anthranilate synthase and 4-amino-4-deoxychorismate (ADC) synthase; CTP synthase; GMP synthase; glutamine-dependent carbamoyl-phosphate synthase; phosphoribosylformylglycinamidine synthase II; and the histidine amidotransferase hisH.

    Proteins where this domain is known:
    PF13_0044    PFI1100w   


    PR00097 - ANTSNTHASEII (Prints link)

    Interpro entry IPR006220 : Anthranilate synthase component II/delta crystallin (Interpro link)

    Interpro description:
    Anthranilate synthase (ASase) is a tetrameric protein comprising two copies each of components I and II. The protein catalyses the first step in the tryptophan biosynthetic pathway, namely the conversion of chorismate and an ammonium ion to anthranilate. Component I obtains this ion from ammonia, whereas component II obtains the ion from glutamine using the glutamine amidotransferase (GATase) activity.

    In some bacteria, such as Escherichia coli, component II can be much larger than in other organisms, due to the presence of phosphoribosyl-anthranilate transferase (PRTase) activity. This is the second step in tryptophan biosynthesis and results in the addition of 5-phosphoribosyl-1-pyrophosphate to anthranilate to create N-5'-phosphoribosyl-anthranilate. Some studies have suggested that the larger component II could have arisen by gene fusion, a hypothesis supported by the fact that the two activities are found in discrete domains that are physically separated in the 3D model.

    Proteins where this domain is known:
    PFI1100w   


    PR00098 - CPSASE (Prints link)

    Interpro entry IPR005483 : Carbamoyl phosphate synthase, large subunit (Interpro link)

    Interpro description:

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.

    This entry represents the large subunit of carbamoyl phosphate synthase.

    Proteins where this domain is known:
    PF13_0044   


    PR00099 - CPSGATASE (Prints link)

    Interpro entry IPR001317 : Carbamoyl phosphate synthase, GATase region (Interpro link)

    Interpro description:

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.

    This entry represents the domain responsible for GATase (glutamine amidotransferase) activity in CPSases, which catalyses the hydrolysis of glutamine to glutamate and ammonia. This reaction occurs at the active site on the small subunit of CPSases. This function has been detected in some other enzymes, including aminodeoxychorismate synthase and anthranilate synthase component II, all of which show sequence similarity in the area thought to contain the GATase activity. The active site contains a conserved Cys residue, which is necessary for catalytic activity, and several conserved residues in the areas surrounding this Cys have also been found to be important.

    Proteins where this domain is known:
    PF13_0044   


    PR00106 - DNAPOLB (Prints link)

    Interpro entry IPR017966 : (Interpro link)

    Interpro description:

    DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation.

    A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.

    Proteins where this domain is known:
    PFD0590c   


    PR00109 - TYRKINASE (Prints link)

    Interpro entry IPR001245 : Tyrosine protein kinase (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.

    Tyrosine phosphorylating activity was originally detected in two viral transforming proteins, but many retroviral transforming proteins and their cellular counterparts have since been shown to possess such activity. The growth factor receptors, which are activated by ligand binding, and the insulin-related peptide receptor, are also family members.

    Proteins where this domain is known:
    PFB0520w   


    PR00114 - STPHPHTASE (Prints link)

    Interpro entry IPR006186 : Serine/threonine-specific protein phosphatase and bis(5-nucleosyl)-tetraphosphatase (Interpro link)

    Interpro description:

    Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of PP have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.

    Protein phosphatase-1 (PP1) and protein phosphatase-2A (PP2A) have a broad specificity and there are two closely related isoforms of each, alpha and beta. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 kDa regulatory subunit and a third variable subunit. Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin, is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. Other serine/threonine specific protein phosphatases that have been characterised include mammalian phosphatase-X (PP-X) and Drosophila phosphatase-V (PP-V), which are closely related to, yet distinct from, PP2A; yeast phosphatase PPH3, which is similar to PP2A but has different enzymatic properties; and Drosophila phosphatase-Y (PP-Y) and yeast phosphatases Z1 and Z2, which are closely related to, yet distinct from, PP1.

    Proteins where this domain is known:
    MAL13P1.274    PF08_0129    PF10_0177    PF14_0224    PF14_0630    PFC0595c    PFI1245c    PFI1360c   


    PR00116 - ARGINASE (Prints link)

    Interpro entry IPR005924 : Arginase (Interpro link)

    Interpro description:

    L-Arginine is converted to nitric oxide and citrulline by the enzyme nitric oxide synthase, and is hydrolysed to ornithine and urea by the enzyme arginase as part of the hepatic urea cycle. Arginase is a manganese metalloenzyme containing a metal-activated hydroxide ion, a critical nucleophile in metalloenzymes that catalyze hydrolysis or hydration reactions. A hydrogen bond formed by the metal-bound hydroxide holds the enzyme in the proper orientation for catalysis; however, nonmetal substrate-binding sites are also implicated in the enzyme mechanism. Regeneration of metal-bound hydroxide ion from a metal-bound water molecule requires proton transfer to bulk solvent mediated by a histidine proton shuttle residue.

    Proteins where this domain is known:
    PFI0320w   


    PR00119 - CATATPASE (Prints link)

    Interpro entry IPR001757 : ATPase, P-type, K/Mg/Cd/Cu/Zn/Na/Ca/Na/H-transporter (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    This entry represents several classes of P-type ATPases, including those that transport K+, Mg2+, Cd2+, Cu2+, Zn2+, Na+, Ca2+, Na+/K+, and H+/K+. These P-ATPases are found in both prokaryotes and eukaryotes.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF07_0115    PFI0240c    PFL0590c    PFL1125w   


    PR00121 - NAKATPASE (Prints link)

    Interpro entry IPR006069 : ATPase, P-type cation exchange, alpha subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    This entry represents the alpha subunit found in the P-type cation exchange ATPases found in the plasma membranes of both prokaryotes and eukaryotes. These P-ATPases include both H+/K+-ATPases and Na+/K+-ATPases, which belong to the IIC subfamily of ATPases. These ATPases catalyse the hydrolysis of ATP coupled with the exchange of cations, pumping one cation out of the cell (H+ or Na+) in exchange for K+. These ATPases contain an alpha subunit that is the catalytic component, and a regulatory beta subunit that stabilizes the alpha/beta assembly. Different alpha and beta isoforms exist, permitting greater regulatory control.

    An example of an H+/K+-ATPase is the gastric pump responsible for acid secretion in the stomach, transporting protons from the cytoplasm of parietal cells to create a large pH gradient in exchange for the internalization of potassium ions, using ATP hydrolysis to drive the pump.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PFL0590c   


    PR00122 - VACATPASE (Prints link)

    Interpro entry IPR000245 : ATPase, V0 complex, proteolipid subunit C (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    This entry represents the 16 kDa proteolipid subunit c that is part of the V0 complex of V-ATPase in eukaryotic organelles and in certain bacteria. There are three proteolipid subunits (c, c' and c'') that form part of the proton-conducting pore, each containing a buried glutamic acid residue that is essential for proton transport, and together they form a hexameric ring spanning the membrane.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    MAL13P1.271   


    PR00125 - ATPASEDELTA (Prints link)

    Interpro entry IPR000711 : ATPase, F1 complex, OSCP/delta subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    MAL13P1.47   


    PR00126 - ATPASEGAMMA (Prints link)

    Interpro entry IPR000131 : ATPase, F1 complex, gamma subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0061   


    PR00127 - CLPPROTEASEP (Prints link)

    Interpro entry IPR001907 : Peptidase S14, ClpP (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
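
    The link between linear triad order and clan can be sketched in a few lines of Python. The function names and the lookup table are hypothetical helpers written for this illustration, and the residue positions in the usage note are commonly cited numberings given only as examples, not values taken from this entry.

```python
# Linear order of the catalytic triad residues mapped to the clan named in the text.
CLAN_BY_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """Return the triad residues as a string sorted by sequence position.

    positions maps a one-letter residue code (H, D, S) to its position.
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def clan_for(positions):
    """Return the clan suggested by the triad order, or None if unrecognised."""
    return CLAN_BY_ORDER.get(triad_order(positions))
```

    For instance, a chymotrypsin-like arrangement with His57, Asp102 and Ser195 (illustrative numbering) gives the order HDS and therefore clan PA, while Asp32, His64 and Ser221 gives DHS and clan SB. A matching order is only a hint of clan membership; the fold still has to be checked.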

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to the MEROPS peptidase family S14 (ClpP endopeptidase family, clan SK). ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity. This family of sequences represents the P subunit.

    Proteases highly similar to ClpP have been found to be encoded in the genome of bacteria, metazoa, some viruses and in the chloroplast of plants. A number of the proteins in this family are classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.

    Proteins where this domain is known:
    PF14_0348    PFC0310c   


    PR00141 - PROTEASOME (Prints link)

    Interpro entry IPR000243 : Peptidase T1A, proteasome beta-subunit (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Threonine peptidases are characterised by a threonine nucleophile at the N terminus of the mature enzyme. The threonine peptidases belong to clan PB or are unassigned, clan T-. The type example for this clan is the archaean proteasome beta component of Thermoplasma acidophilum.

    This family of threonine peptidases belongs to MEROPS peptidase family T1 (clan PB(T)), subfamily T1A.

    The proteasome (or macropain) is a eukaryotic and archaeal multicatalytic proteinase complex that seems to be involved in an ATP/ubiquitin-dependent nonlysosomal proteolytic pathway. In eukaryotes the proteasome is composed of about 28 distinct subunits which form a highly ordered ring-shaped structure (20S ring) of about 700 kDa. Most proteasome subunits can be classified, on the basis of sequence similarities, into two groups, alpha (A) and beta (B). This family contains the beta subunit sequences, which range from 190 to 290 amino acids.
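
    The figures quoted above can be cross-checked with a little arithmetic. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not stated in this entry, and the variable and function names are hypothetical.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0

# Values taken from the description above.
COMPLEX_MASS_KDA = 700.0
N_SUBUNITS = 28
BETA_LENGTH_RANGE = (190, 290)


def approx_mass_kda(n_residues):
    """Rough polypeptide mass in kDa from its length in residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Mean mass per subunit implied by the complex size and subunit count.
mean_subunit_kda = COMPLEX_MASS_KDA / N_SUBUNITS

# Approximate mass range of beta subunits implied by their length range.
beta_mass_range_kda = tuple(approx_mass_kda(n) for n in BETA_LENGTH_RANGE)
```

    Under that assumption the mean subunit mass implied by the complex falls inside the range estimated from the beta subunit lengths, so the quoted numbers are mutually consistent.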

    Proteins where this domain is known:
    PF10_0111    PF13_0156   


    PR00143 - CITRTSNTHASE (Prints link)

    Interpro entry IPR002020 : Citrate synthase-like (Interpro link)

    Interpro description:

    Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).

    Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.

    This entry represents types I and II citrate synthase enzymes, as well as the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrially produced citrate.

    Proteins where this domain is known:
    PF10_0218   


    PR00148 - ENOLASE (Prints link)

    Interpro entry IPR000941 : Enolase (Interpro link)

    Interpro description:

    Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer, in adult skeletal muscle, a beta homodimer, and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.

    Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.

    Proteins where this domain is known:
    PF10_0155   


    PR00150 - PEPCARBXLASE (Prints link)

    Interpro entry IPR001449 : Phosphoenolpyruvate carboxylase (Interpro link)

    Interpro description:

    Phosphoenolpyruvate carboxylase (PEPCase), an enzyme found in all multicellular plants, catalyses the formation of oxaloacetate from phosphoenolpyruvate (PEP) and a hydrocarbonate ion. This reaction is harnessed by C4 plants to capture and concentrate carbon dioxide into the photosynthetic bundle sheath cells. It also plays a key role in the nitrogen fixation pathway in legume root nodules: here it functions in concert with glutamine, glutamate and asparagine synthetases and aspartate amido transferase, to synthesise aspartate and asparagine, the major nitrogen transport compounds in various amine-transporting plant species.

    PEPCase also plays an anaplerotic role in bacteria and plant cells, supplying oxaloacetate to the TCA cycle, which requires continuous input of C4 molecules in order to replenish the intermediates removed for amino acid biosynthesis. The C-terminus of the enzyme contains the active site that includes a conserved lysine residue, involved in substrate binding, and other conserved residues important for the catalytic mechanism.

    Proteins where this domain is known:
    PF14_0246   


    PR00153 - CSAPPISMRASE (Prints link)

    Interpro entry IPR002130 : Peptidyl-prolyl cis-trans isomerase, cyclophilin-type (Interpro link)

    Interpro description:

    Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.

    Note: FKBPs, a family of proteins that bind the immunosuppressive drug FK506, are also PPIases, but their sequence is not at all related to that of cyclophilin.

    Proteins where this domain is known:
    PF08_0121    PF11_0170    PFE0505w    PFE1430c    PFL0120c    PFL0735w   


    PR00190 - ACTIN (Prints link)

    Interpro entry IPR004000 : Actin/actin-like (Interpro link)

    Interpro description:

    Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.

    In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc.

    Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.

    Proteins where this domain is known:
    PF11_0114    PF14_0124    PFA0190c    PFL2215w   


    PR00192 - FACTINCAPB (Prints link)

    Interpro entry IPR001698 : F-actin capping protein, beta subunit (Interpro link)

    Interpro description:

    The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.

    The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.

    The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species.

    Proteins where this domain is known:
    PFE0880c   


    PR00193 - MYOSINHEAVY (Prints link)

    Interpro entry IPR001609 : Myosin head, motor region (Interpro link)

    Interpro description:

    Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region in their C-terminal. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy. The 3-D structure of the head portion of myosin has been determined and a model for actin-myosin complex has been constructed.

    The globular head is well conserved, with some highly conserved regions possibly relating to functional and structural domains. The rod-like tail starts with an invariant proline residue, and contains many repeats of a 28 residue region, interrupted at 4 regularly-spaced points known as skip residues. Although the sequence of the tail is not well conserved, its chemical character is, with hydrophobic, charged and skip residues occurring in a highly ordered and repeated fashion.

    Proteins where this domain is known:
    MAL13P1.148    PF11_0416    PFE0175c    PFF0675c   


    PR00195 - DYNAMIN (Prints link)

    Interpro entry IPR001401 : Dynamin, GTPase region (Interpro link)

    Interpro description:

    Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance.

    The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains; the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.

    This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.

    Proteins where this domain is known:
    PF10_0368   


    PR00219 - SYNAPTOBREVN (Prints link)

    Interpro entry IPR001388 : Synaptobrevin (Interpro link)

    Interpro description:

    Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the protein-protein interactions underlying this is still uncertain. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.

    Proteins where this domain is known:
    MAL13P1.135   


    PR00297 - CHAPERONIN10 (Prints link)

    Interpro entry IPR001476 : Chaperonin Cpn10 (Interpro link)

    Interpro description:

    The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

    Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.
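
    The subunit counts described above translate into nominal assembly sizes as follows. This is a minimal sketch that uses the nominal 60 kDa and 10 kDa subunit masses from the protein names rather than measured masses; the function name is hypothetical.

```python
def ring_assembly(subunit_kda, subunits_per_ring, rings):
    """Return (total subunits, nominal mass in kDa) for a stacked-ring oligomer."""
    total_subunits = subunits_per_ring * rings
    return total_subunits, total_subunits * subunit_kda


# cpn60 / GroEL: two heptameric rings stacked back to back.
cpn60_subunits, cpn60_mass_kda = ring_assembly(60, 7, 2)

# cpn10 / GroES: a single heptameric dome.
cpn10_subunits, cpn10_mass_kda = ring_assembly(10, 7, 1)
```

    With these nominal values the cpn60 tetradecamer has 14 subunits and the cpn10 heptamer has 7, so one cpn10 ring supplies one subunit per subunit of the cpn60 ring it caps.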

    Proteins where this domain is known:
    PFL0740c   


    PR00298 - CHAPERONIN60 (Prints link)

    Interpro entry IPR001844 : (Interpro link)

    Interpro description:

    The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10). Type II chaperonins, found in eukaryotic cytosol and in Archaebacteria, comprise only a cpn60 member.

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronizes the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

    The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaire's disease, and is thought to play a role in the protection of the Legionella spp. bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å.

    Proteins where this domain is known:
    PF10_0153    PFL1545c   


    PR00300 - CLPPROTEASEA (Prints link)

    Interpro entry IPR001270 : Chaperonin clpA/B (Interpro link)

    Interpro description:

    A group of ATP-binding proteins that includes the regulatory subunit of the ATP-dependent protease clpA; heat shock proteins clpB, 104 and 78; and chloroplast proteins CD4a (ClpC) and CD4b belong to this family. The proteins are thought to protect cells from stress by controlling the aggregation and denaturation of vital cellular structures. They vary in size, but share a domain which contains an ATP-binding site.

    These signatures which span the ATP binding region also identify the bacterial DNA polymerase III subunit tau, ATP-dependent protease La and the mitochondrial lon protease homolog, both of which belong to MEROPS peptidase family S16.

    Proteins where this domain is known:
    PF08_0063    PF11_0175    PF11_0405    PF14_0063   


    PR00301 - HEATSHOCK70 (Prints link)

    Interpro entry IPR001023 : Heat shock protein Hsp70 (Interpro link)

    Interpro description:

    A family of heat shock proteins, the hsp70 proteins have an average molecular weight of 70 kDa. In most species, there are many proteins that belong to the hsp70 family. Some of these are only expressed under stress conditions (strictly inducible), while some are present in cells under normal growth conditions and are not heat-inducible (constitutive or cognate). Hsp70 proteins can be found in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, etc.).

    Little is known of the function of hsp70 proteins. Some evidence suggests that the constitutive members have a role in the disassembly of clathrin cages, and may also participate in the post-translational transmembrane targeting of proteins to cellular organelles. No specific activities or associations have been found for the inducible members, although it has been suggested that they may accept incoming precursor proteins, keep them unfolded, then pass them on to the hsp60/hsp10 (cpn60/cpn10) complex for folding and assembly.

    Proteins where this domain is known:
    PF07_0033    PF08_0054    PF11_0351    PFI0875w   


    PR00304 - TCOMPLEXTCP1 (Prints link)

    Interpro entry IPR017998 : (Interpro link)

    Interpro description:

    Protein folding is thought to be the sole result of properties inherent in polypeptide primary sequences. Sometimes, however, additional proteins are required to mediate correct folding and subsequent oligomer assembly. These 'helpers', or chaperones, bind to specific protein surfaces, preventing incorrect folding and formation of non-functional structures.

    The tailless complex polypeptide 1 (TCP-1) is a highly structurally conserved molecular chaperone located in the cytosol. The protein has also been shown to bind to Golgi membranes and to microtubules, this latter property suggesting a role in mitotic spindle formation in dividing cells (especially in sperm, where it is highly abundant). TCP-1 forms a double ring structure, similar to the 10kDa and 60kDa chaperonins, with 6-8 subunits per ring. The amino acid sequence is significantly similar to the 60kDa chaperonin, and to TF55, a chaperone from the archaebacterium Sulfolobus shibatae.

    Proteins where this domain is known:
    MAL13P1.283    PF11_0331    PFB0635w    PFC0285c    PFC0350c    PFC0900w    PFF0430w    PFL1425w   


    PR00305 - 1433ZETA (Prints link)

    Interpro entry IPR000308 : 14-3-3 protein (Interpro link)

    Interpro description:

    The 14-3-3 proteins are a large family of approximately 30kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.

    14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to affect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein, by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification, or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
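
    The RSxpSxP consensus can be turned into a simple sequence scan. The sketch below is illustrative only: a plain amino-acid sequence carries no phosphorylation information, so it reports candidate sites whose second serine would need to be phosphorylated. The function name and the example sequence are hypothetical.

```python
import re

# R, S, any residue, S (the serine that must be phosphorylated), any residue, P.
# The lookahead lets overlapping candidates be reported as well.
MOTIF = re.compile(r"(?=(RS[A-Z]S[A-Z]P))")


def find_candidate_sites(sequence):
    """Return (motif start, matched residues, phosphoserine position), 1-based."""
    seq = sequence.upper()
    hits = []
    for match in MOTIF.finditer(seq):
        start = match.start() + 1  # convert to 1-based numbering
        hits.append((start, match.group(1), start + 3))
    return hits


# Hypothetical sequence used only to exercise the scan.
example = "MKRSTSVPAAARSASEPQ"
sites = find_candidate_sites(example)
```

    A hit only marks a sequence that fits the consensus; whether the serine is actually phosphorylated, and whether 14-3-3 binds there, has to be established experimentally, and the text notes that exceptions to the consensus exist.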

    Proteins where this domain is known:
    MAL13P1.309    MAL8P1.69   


    PR00314 - CLATHRINADPT (Prints link)

    Interpro entry IPR001392 : Clathrin adaptor, mu subunit (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents the mu subunit of various clathrin adaptors (AP1, AP2 and AP3). The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.
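
    The three-part layout described above implies an approximate overall length for a mu subunit. The following is a minimal sketch of that arithmetic, assuming the "around 230" and "around 190" figures are taken as exact lengths; the names N_TERM_LEN, C_TERM_LEN and mu_layout are illustrative and do not come from InterPro.

```python
# Approximate domain lengths quoted in the description (treated as exact here).
N_TERM_LEN = 230               # conserved N-terminal domain
C_TERM_LEN = 190               # less conserved C-terminal domain
LINKER_MIN, LINKER_MAX = 10, 42  # linker length range


def mu_layout(linker_len):
    """Return 1-based (start, end) residue ranges for a given linker length."""
    if not LINKER_MIN <= linker_len <= LINKER_MAX:
        raise ValueError("linker length outside the 10-42 residue range")
    linker_end = N_TERM_LEN + linker_len
    return {
        "N-terminal domain": (1, N_TERM_LEN),
        "linker": (N_TERM_LEN + 1, linker_end),
        "C-terminal domain": (linker_end + 1, linker_end + C_TERM_LEN),
    }


def approx_length_range():
    """Shortest and longest total lengths implied by the linker range."""
    shortest = mu_layout(LINKER_MIN)["C-terminal domain"][1]
    longest = mu_layout(LINKER_MAX)["C-terminal domain"][1]
    return shortest, longest
```

    Under these assumptions the implied total length is roughly 430 to 462 residues; real sequences will deviate because the domain lengths are only approximate.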

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0202    PF13_0062    PFL0885w   


    PR00315 - ELONGATNFCT (Prints link)

    Interpro entry IPR000795 : Protein synthesis factor, GTP-binding (Interpro link)

    Interpro description:
    Elongation factors belong to a family of proteins that promote the GTP-dependent binding of aminoacyl tRNA to the A site of ribosomes during protein biosynthesis, and catalyse the translocation of the synthesised protein chain from the A to the P site. The proteins are all relatively similar in the vicinity of their C-termini, and are also highly similar to a range of proteins that includes the nodulation Q protein from Rhizobium meliloti (Sinorhizobium meliloti), bacterial tetracycline resistance proteins and the omnipotent suppressor protein 2 from yeast.

    In both prokaryotes and eukaryotes, there are three distinct types of elongation factors: EF-1alpha (EF-Tu), which binds GTP and an aminoacyl-tRNA and delivers the latter to the A site of ribosomes; EF-1beta (EF-Ts), which interacts with EF-1a/EF-Tu to displace GDP and thus allows the regeneration of GTP-EF-1a; and EF-2 (EF-G), which binds GTP and peptidyl-tRNA and translocates the latter from the A site to the P site. In EF-1-alpha, a specific region has been shown to be involved in a conformational change mediated by the hydrolysis of GTP to GDP. This region is conserved in both EF-1alpha/EF-Tu and EF-2/EF-G and thus seems typical for GTP-dependent proteins which bind non-initiator tRNAs to the ribosome. The GTP-binding protein synthesis factor family also includes the eukaryotic peptide chain release factor GTP-binding subunits and prokaryotic peptide chain release factor 3 (RF-3); the prokaryotic GTP-binding protein lepA and its homologues in yeast (GUF1) and Caenorhabditis elegans (ZK1236.1); yeast HBS1; rat statin S1; and the prokaryotic selenocysteine-specific elongation factor selB.

    Proteins where this domain is known:
    MAL13P1.164    PF07_0062    PF11_0245    PF13_0304    PF13_0305    PF14_0104    PF14_0486    PFF0115c    PFF0345w    PFI0570w    PFL1590c    PFL1710c   
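
    Each record in this listing follows the same layout: a signature line, an optional Interpro entry line, a description, and a whitespace-separated list of gene identifiers. As a rough sketch of how one such record could be read programmatically (the function name parse_record and the dictionary keys are illustrative assumptions, and the regular expressions are inferred only from the records shown here):

```python
import re

# "PR00315 - ELONGATNFCT (Prints link)" -> accession, short name, source database
HEADER_RE = re.compile(r"^\s*(\S+) - (\S+) \((.+?) link\)\s*$")
# "Interpro entry IPR000795 : <name> (Interpro link)"; the name may be empty
ENTRY_RE = re.compile(r"^\s*Interpro entry (IPR\d{6}) : (.*?)\s*\(Interpro link\)\s*$")
PROTEINS_MARKER = "Proteins where this domain is known:"


def parse_record(text):
    """Parse one record of the listing into a plain dictionary."""
    record = {"signature": None, "name": None, "source": None,
              "interpro_id": None, "interpro_name": None, "proteins": []}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        header = HEADER_RE.match(line)
        if header and record["signature"] is None:
            record["signature"], record["name"], record["source"] = header.groups()
            continue
        entry = ENTRY_RE.match(line)
        if entry:
            record["interpro_id"] = entry.group(1)
            record["interpro_name"] = entry.group(2) or None
            continue
        if line.strip() == PROTEINS_MARKER:
            # Identifiers sit on the next non-empty line, separated by spaces.
            for following in lines[i + 1:]:
                if following.strip():
                    record["proteins"] = following.split()
                    break
    return record
```

    Applied to the PR00315 record above, this would yield the signature PR00315, the entry IPR000795 and the twelve gene identifiers as a list. Records without an entry name, such as IPR013027, give interpro_name as None.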


    PR00320 - GPROTEINBRPT (Prints link)

    Proteins where this domain is known:
    MAL8P1.43    PF08_0019    PF10_0261    PF11_0471    PFC0100c    PFC0365w    PFE0540w    PFF0330w    PFL0970w    PFL1290w    PFL1820w   


    PR00322 - G10 (Prints link)

    Interpro entry IPR001748 : G10 protein (Interpro link)

    Interpro description:
    A Xenopus protein known as G10 has been found to be highly conserved in a wide range of eukaryotic species. The function of G10 is still unknown. G10 is a protein of about 17 to 18 kDa (143 to 157 residues) which is hydrophilic and whose C-terminal half is rich in cysteines and could be involved in metal-binding.

    Proteins where this domain is known:
    PFE1140c   


    PR00326 - GTP1OBG (Prints link)

    Interpro entry IPR006073 : GTP1/OBG (Interpro link)

    Interpro description:

    Several proteins have recently been shown to contain the 5 structural motifs characteristic of GTP-binding proteins. These include murine DRG protein; GTP1 protein from Schizosaccharomyces pombe; OBG protein from Bacillus subtilis; and several others. Although the proteins contain GTP-binding motifs and are similar to each other, they do not share sequence similarity to other GTP-binding proteins, and have thus been classed as a novel group, the GTP1/OBG family. As yet, the functions of these proteins are uncertain, but they have been shown to be important in development and normal cell metabolism.

    Proteins where this domain is known:
    MAL13P1.294    MAL7P1.122    MAL8P1.33    PF11_0143    PF14_0114    PF14_0221    PF14_0339    PFD0710w    PFE1215c    PFF0625w    PFL0835w   


    PR00328 - SAR1GTPBP (Prints link)

    Interpro entry IPR006689 : ARF/SAR superfamily (Interpro link)

    Interpro description:

    The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.

    The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.

    Proteins where this domain is known:
    PF14_0399    PFL2245w   


    PR00348 - UBIQUITIN (Prints link)

    Interpro entry IPR000626 : Ubiquitin (Interpro link)

    Interpro description:

    Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.

    Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoans to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.
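
    The relationship between linkage type and outcome stated above can be summarised as a small lookup. This is only an illustrative sketch: the names chain_outcome, DEGRADATIVE and REGULATORY are assumptions, the "K" notation for lysine positions is shorthand for the Lys(n) forms in the text, and linkages the description does not mention are reported as unspecified rather than guessed.

```python
# Outcomes as worded in the description above.
DEGRADATION = "targeted to the proteasome for degradation"
REGULATION = "reversible modification of activity, location or trafficking"
UNSPECIFIED = "outcome not stated in this description"

# Linkage groups transcribed from the text; K = lysine position in ubiquitin.
DEGRADATIVE = {"K11", "K29", "K48"}
REGULATORY = {"K6", "K63"}


def chain_outcome(linkage):
    """Map 'mono' or a lysine linkage such as 'K48' to the stated outcome."""
    if linkage == "mono" or linkage in REGULATORY:
        return REGULATION
    if linkage in DEGRADATIVE:
        return DEGRADATION
    return UNSPECIFIED
```

    For instance, chain_outcome("K48") returns the degradation outcome, while chain_outcome("K63") and chain_outcome("mono") return the regulatory one. The description is itself tentative ("it appears that"), so the table should be read the same way.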

    Proteins where this domain is known:
    MAL13P1.64    PF13_0346   


    PR00368 - FADPNR (Prints link)

    Interpro entry IPR013027 : (Interpro link)

    Interpro description:
    This entry describes both class I and class II oxidoreductases. FAD flavoproteins belonging to the family of pyridine nucleotide-disulphide oxidoreductases (glutathione reductase, trypanothione reductase, lipoamide dehydrogenase, mercuric reductase, thioredoxin reductase, alkyl hydroperoxide reductase) share sequence similarity with a number of other flavoprotein oxidoreductases, in particular with ferredoxin-NAD+ reductases involved in oxidative metabolism of a variety of hydrocarbons (rubredoxin reductase, putidaredoxin reductase, terpredoxin reductase, ferredoxin-NAD+ reductase components of benzene 1,2-dioxygenase, toluene 1,2-dioxygenase, chlorobenzene dioxygenase, biphenyl dioxygenase), NADH oxidase and NADH peroxidase. Comparison of the crystal structures of human glutathione reductase and Escherichia coli thioredoxin reductase reveals different locations of their active sites, suggesting that the enzymes diverged from an ancestral FAD/NAD(P)H reductase and acquired their disulphide reductase activities independently.

    Despite functional similarities, oxidoreductases of this family show no sequence similarity with adrenodoxin reductases and flavoprotein pyridine nucleotide cytochrome reductases (FPNCR). Assuming that disulphide reductase activity emerged later, during divergent evolution, the family can be referred to as FAD-dependent pyridine nucleotide reductases, FADPNR.

    To date, 3D structures of glutathione reductase, thioredoxin reductase, mercuric reductase, lipoamide dehydrogenase, trypanothione reductase and NADH peroxidase have been solved. The enzymes share similar tertiary structures based on a doubly-wound alpha/beta fold, but the relative orientations of their FAD- and NAD(P)H-binding domains may vary significantly. By contrast with the FPNCR family, the folds of the FAD- and NAD(P)H-binding domains are similar, suggesting that the domains evolved by gene duplication.

    Proteins where this domain is known:
    PF07_0085    PF08_0066    PF14_0192    PFI0735c   


    PR00369 - FLAVODOXIN (Prints link)

    Interpro entry IPR001094 : Flavodoxin-like (Interpro link)

    Interpro description:
    Flavodoxins act in various electron-transport systems as functional analogues of ferredoxins. Although flavodoxins are found only in certain bacteria and algae, the proteins share similarity with a number of protein domains of both prokaryotic and eukaryotic origin.

    Proteins where this domain is known:
    PFI1140w   


    PR00371 - FPNCR (Prints link)

    Interpro entry IPR001709 : Flavoprotein pyridine nucleotide cytochrome reductase (Interpro link)

    Interpro description:

    Flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+ reductases (FNR), plant and fungal NAD(P)H:nitrate reductases, NADH:cytochrome b5 reductases, NADPH:P450 reductases, NADPH:sulphite reductases, nitric oxide synthases, phthalate dioxygenase reductase, and various other flavoproteins.

    Despite functional similarities, FPNCRs show no sequence similarity to NADPH:adrenodoxin reductases, nor to bacterial ferredoxin:NAD+ reductases and their homologues. To date, 3D-structures of 4 members of the family have been solved: Spinacia oleracea (Spinach) ferredoxin:NADP+ reductase; Burkholderia cepacia (Pseudomonas cepacia) phthalate dioxygenase reductase; the flavoprotein domain of Zea mays (Maize) nitrate reductase; and Sus scrofa (Pig) NADH:cytochrome b5 reductase. In all of them, the FAD-binding domain (N-terminal) has the topology of an anti-parallel beta-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel beta-sheet with 2 helices on each side). In spite of such structural similarities, the level of amino acid identity between family members is at or below the limit of significance (e.g., nitrate reductase is only 15% identical to FNR).

    Proteins where this domain is known:
    PFF1115w   


    PR00380 - KINESINHEAVY (Prints link)

    Interpro entry IPR001752 : Kinesin, motor region (Interpro link)

    Interpro description:

    Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.

    The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, to bind and move on microtubules), a central alpha-helical coiled coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.

    A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:

    The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.

    The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule binding.

    Proteins where this domain is known:
    MAL8P1.132    PF07_0104    PF11_0478    PFA0535c    PFC0770c    PFC0860w    PFL0545w    PFL2165w    PFL2190c   


    PR00387 - PDIESTERASE1 (Prints link)

    Interpro entry IPR002073 : 3'5'-cyclic nucleotide phosphodiesterase (Interpro link)

    Interpro description:

    The cyclic nucleotide phosphodiesterases (PDE) comprise a group of enzymes that degrade the phosphodiester bond in the second messenger molecules cAMP and cGMP. They are divided into 11 families. They regulate the localisation, duration and amplitude of cyclic nucleotide signalling within subcellular domains. PDEs are therefore important for signal transduction.

    PDE enzymes are often targets for pharmacological inhibition due to their unique tissue distribution, structural properties, and functional properties. Inhibitors include: Roflumilast for chronic obstructive pulmonary disease and asthma, Sildenafil for erectile dysfunction and Cilostazol for peripheral arterial occlusive disease, amongst others.

    Retinal 3',5'-cGMP phosphodiesterase is located in photoreceptor outer segments: it is light activated, playing a pivotal role in signal transduction. In rod cells, PDE is oligomeric, comprising an alpha-, a beta- and 2 gamma-subunits, while in cones, PDE is a homodimer of alpha chains, which are associated with several smaller subunits. Both rod and cone PDEs catalyse the hydrolysis of cAMP or cGMP to the corresponding nucleoside 5' monophosphates, both enzymes also binding cGMP with high affinity. The cGMP-binding sites are located in the N-terminal half of the protein sequence, while the catalytic core resides in the C-terminal portion.

    Proteins where this domain is known:
    MAL13P1.118    MAL13P1.119    PF14_0672    PFL0475w   


    PR00390 - PHPHLIPASEC (Prints link)

    Interpro entry IPR001192 : Phospholipase C, phosphoinositol-specific, C-terminal (PLC) (Interpro link)

    Interpro description:

    Phosphoinositol-specific phospholipase C (PLC) plays an important role in signal transduction processes, mediating the cellular actions of a variety of hormones, neurotransmitters and growth factors. Upon agonist-dependent activation, PLC catalyses the hydrolysis of membrane phosphatidylinositol 4,5-bisphosphate (PIP2), generating the second messengers inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). IP3 binds specific intracellular receptors to trigger Ca2+ mobilisation, while DAG mediates activation of a family of protein kinase C isozymes. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins. Based on molecular size, immunoreactivity and amino acid sequence, several subtypes have been classified. Overall, sequence identity between subtypes is low, yet all isoforms share two conserved domains, designated X and Y.

    All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues; for example, in PLC-beta subtypes, the X and Y domains are separated by a stretch of 70-120 amino acids rich in Ser, Thr and acidic residues (their C terminus is rich in basic residues). However, in PLC-gammas, there is an insert of more than 400 residues containing a PH domain, two SH2 domains, and one SH3 domain. The two conserved X and Y domains have been shown to be important for the catalytic activity. At the C-terminal end of the Y-box, there is a C2 domain, possibly involved in Ca-dependent membrane attachment. PLCs show little similarity in the 300-residue N-terminal region preceding the X-domain.

    This entry represents a PLC region found towards the C-terminus which contains the X and Y boxes and the Ca2+-dependent membrane-targeting module of these proteins.

    Proteins where this domain is known:
    PF10_0132   


    PR00391 - PITRANSFER (Prints link)

    Interpro entry IPR001666 : Phosphatidylinositol transfer protein (Interpro link)

    Interpro description:
    Phosphatidylinositol transfer protein (PITP) is a ubiquitous cytosolic protein, thought to be involved in transport of phospholipids from their site of synthesis in the endoplasmic reticulum and Golgi to other cell membranes. More recently, PITP has been shown to be an essential component of the polyphosphoinositide synthesis machinery and is hence required for proper signalling by epidermal growth factor and f-Met-Leu-Phe, as well as for exocytosis. The role of PITP in polyphosphoinositide synthesis may also explain its involvement in intracellular vesicular traffic.

    Proteins where this domain is known:
    MAL13P1.256   


    PR00395 - RIBOSOMALS2 (Prints link)

    Interpro entry IPR001865 : Ribosomal protein S2 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal S2 proteins have been shown to belong to a family that includes 40S ribosomal subunit 40kDa proteins, putative laminin-binding proteins, NAB-1 protein and 29.3kDa protein from Haloarcula marismortui. The laminin-receptor proteins are thus predicted to be the eukaryotic homologue of the eubacterial S2 ribosomal proteins.

    Proteins where this domain is known:
    PF10_0264   


    PR00405 - REVINTRACTNG (Prints link)

    Interpro entry IPR001164 : Arf GTPase activating protein (Interpro link)

    Interpro description:

    This entry describes a family of small GTPase activating proteins, for example ARF1-directed GTPase-activating protein and the cycle control GTPase activating protein (GAP) GCS1, which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins, which are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP bound to ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.

    The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to other GAP protein structures, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role for the ARFGAP domain in GTP hydrolysis.
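
    The zinc finger spacing described for this family can be expressed as a regular expression for scanning protein sequences. The sketch below assumes the four-cysteine form C-x2-C-x(16,17)-C-x2-C, in line with the four zinc-coordinating cysteines noted above. The function name find_arfgap_zinc_fingers is illustrative, and a pattern hit alone is not evidence of a functional ARFGAP domain.

```python
import re

# C-x2-C-x(16,17)-C-x2-C, where "." stands for any residue (x).
ARFGAP_ZF = re.compile(r"C.{2}C.{16,17}C.{2}C")


def find_arfgap_zinc_fingers(sequence):
    """Return (start, end) 0-based half-open spans of motif matches.

    Matches are non-overlapping and the search is case-insensitive with
    respect to the input, which is upper-cased before scanning.
    """
    return [match.span() for match in ARFGAP_ZF.finditer(sequence.upper())]
```

    A synthetic sequence such as "MA" + "CAAC" + "A" * 16 + "CAAC" + "K" contains one match, whereas shortening the central spacer to 15 residues gives none. Dedicated signature tools use profile-based methods and remain more reliable than a simple pattern of this kind.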

    The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.

    Proteins where this domain is known:
    PF08_0120    PFL2140c   


    PR00406 - CYTB5RDTASE (Prints link)

    Interpro entry IPR001834 : NADH:cytochrome b5 reductase (CBR) (Interpro link)

    Interpro description:

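    Flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) catalyse the interchange of reducing equivalents between one-electron carriers and the two-electron-carrying nicotinamide dinucleotides. The enzymes include ferredoxin:NADP+ reductases (FNR), plant and fungal NAD(P)H:nitrate reductases, NADH:cytochrome b5 reductases, NADPH:P450 reductases, NADPH:sulphite reductases, nitric oxide synthases, phthalate dioxygenase reductase, and various other flavoproteins.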

    NADH:cytochrome b5 reductase (CBR) serves as electron donor for cytochrome b5, a ubiquitous electron carrier, thus participating in a variety of metabolic pathways (including steroid biosynthesis, desaturation and elongation of fatty acids, P450-dependent reactions, methaemoglobin reduction, etc.). A membrane-bound form of CBR is located on the cytosolic side of the endoplasmic reticulum, while a soluble form is found in erythrocytes. In the membrane-bound form, the N-terminal residue is myristoylated. Deficiency of the erythrocyte form causes hereditary methaemoglobinemia.

    In biological nitrate assimilation, reduction of nitrate to nitrite is catalysed by the multidomain redox enzyme NAD(P)H:nitrate reductase (NR). Three forms of NR are known: an NADH-specific enzyme found in higher plants and algae; an NAD(P)H-bispecific enzyme found in higher plants, algae and fungi; and an NADPH-specific enzyme found only in fungi. NR can be divided into 3 structure/function domains: the molybdopterin cofactor binds in the N-terminal domain; the central region is the cytochrome b domain, which is similar to animal cytochrome b5; and the C-terminal portion of the protein is occupied by the FAD/NAD(P)H binding domain, which is similar to CBR. The catalytic reduction of nitrate to nitrite can be viewed as a single polypeptide electron transport chain with electron flow from NAD(P)H -> FAD -> cytochrome b5 -> molybdopterin -> NO(3). Thus, the flavin domain of NR is functionally identical to CBR.

    To date, the 3D-structures of the flavoprotein domain of Zea mays (Maize) nitrate reductase and of Sus scrofa (Pig) NADH:cytochrome b5 reductase have been solved. The overall fold is similar to that of ferredoxin:NADP+ reductase: the FAD-binding domain (N-terminal) has the topology of an anti-parallel beta-barrel, while the NAD(P)-binding domain (C-terminal) has the topology of a classical pyridine dinucleotide-binding fold (i.e. a central parallel beta-sheet flanked by 2 helices on each side).

    Proteins where this domain is known:
    PF13_0353   


    PR00417 - PRTPISMRASEI (Prints link)

    Interpro entry IPR000380 : DNA topoisomerase, type IA, core (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry describes the core region of type IA topoisomerases, which are highly conserved enzymes that are structurally distinct from type IB enzymes. The structures of both topoisomerases I and III have been elucidated, and consist of four domains that together form a toroidal molecule with a central hole that is large enough to accommodate single- and double-stranded DNA. It is believed that the domains transiently separate from one another to allow the entrance and exit of DNA strands.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF13_0251   


    PR00418 - TPI2FAMILY (Prints link)

    Interpro entry IPR001241 : DNA topoisomerase, type IIA, subunit B or N-terminal (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    This entry represents subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has ATPase and subunit interaction capacity.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316    PFL1915w   


    PR00419 - ADXRDTASE (Prints link)

    Interpro entry IPR000759 : Adrenodoxin reductase (Interpro link)

    Interpro description:
    Mitochondrial P450-containing systems comprise 3 components: an FAD-containing flavoprotein, NADPH:adrenodoxin reductase (AR); an iron-sulphur protein, adrenodoxin; and P450. The direction of electron flow is NADPH to AR to adrenodoxin to P450. FAD can be reduced by 2 electrons from NADPH, which are transferred one at a time to adrenodoxin, a one-electron carrier. Both AR and adrenodoxin are soluble proteins located on the matrix side of the inner mitochondrial membrane. Despite functional parallels, AR shows no global similarity either to flavoprotein pyridine nucleotide cytochrome reductases (FPNCR) or to FAD-dependent pyridine nucleotide reductases (FADPNR). However, BLAST searches reveal local similarity of the N-terminal region of AR to glutamate synthase and NADH peroxidase, especially in the nucleotide-binding regions, suggesting that AR and FADPNR may be distantly related.

    Proteins where this domain is known:
    PF11_0407    PF14_0334   


    PR00421 - THIOREDOXIN (Prints link)

    Interpro entry IPR006662 : Thioredoxin-like subdomain (Interpro link)

    Interpro description:

    Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.

    Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.

    A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:

    Bacterial proteins that act as thiol:disulphide interchange proteins that allow disulphide bond formation in some periplasmic proteins also contain a thioredoxin domain. These proteins are:

    This entry represents the thioredoxin domain and homologous domains in other proteins. The motifs in this signature span an invariant Trp residue, the N-terminal of helix 2 that contains two cysteines that form the redox-active disulphide bond, and the fourth beta strand containing an invariant cis-proline.

    Proteins where this domain is known:
    PF13_0272   


    PR00447 - NATRESASSCMP (Prints link)

    Interpro entry IPR001046 : Natural resistance-associated macrophage protein (Interpro link)

    Interpro description:

    The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others. It is expressed at high levels in the intestine, and is the major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations.

    The natural resistance of mice to infection with intracellular parasites is controlled by the Bcg locus, which modulates the cytostatic/cytocidal activity of phagocytes. Nramp1, the gene responsible, is expressed exclusively in macrophages and polymorphonuclear leukocytes, and encodes a polypeptide (natural resistance-associated macrophage protein) with features typical of integral membrane proteins. Other transporter proteins from a variety of sources also belong to this family.

    Proteins where this domain is known:
    PFE1185w   


    PR00448 - NSFATTACHMNT (Prints link)

    Interpro entry IPR000744 : NSF attachment protein (Interpro link)

    Interpro description:

    Regulated exocytosis of neurotransmitters and hormones, as well as intracellular traffic, requires fusion of two lipid bilayers. SNARE proteins are thought to form a protein bridge, the SNARE complex, between an incoming vesicle and the acceptor compartment. SNARE proteins contribute to the specificity of membrane fusion, implying that the mechanisms by which SNAREs are targeted to subcellular compartments are important for specific docking and fusion of vesicles. This mechanism involves a family of conserved proteins, members of which appear to function at all sites of constitutive and regulated secretion in eukaryotes. Among them are 2 types of cytosolic protein, NSF (N-ethyl-maleimide-sensitive protein) and the SNAPs (alpha-, beta- and gamma-soluble NSF attachment proteins). The yeast vesicular fusion protein, sec17, a cytoplasmic peripheral membrane protein involved in vesicular transport between the endoplasmic reticulum and the Golgi apparatus, shows a high degree of sequence similarity to the alpha-SNAP family.

    SNAP-25 and its non-neuronal homologue Syndet/SNAP-23 are synthesized as soluble proteins in the cytosol. Both SNAP-25 and Syndet/SNAP-23 are palmitoylated at cysteine residues clustered in a loop between two N- and C-terminal coils, and palmitoylation is essential for membrane binding and plasma membrane targeting. The C-terminal and the N-terminal helices of SNAP-25 are each targeted to the plasma membrane by two distinct cysteine-rich domains and appear to regulate the availability of SNAP to form complexes with SNARE.

    Proteins where this domain is known:
    PFE0445c   


    PR00449 - RASTRNSFRMNG (Prints link)

    Interpro entry IPR001806 : Ras GTPase (Interpro link)

    Interpro description:

    Many members of the Ras superfamily of GTPases have been implicated in the regulation of hematopoietic cells, with roles in growth, survival, differentiation, cytokine production, chemotaxis, vesicle-trafficking, and phagocytosis. The Ras superfamily of proteins now includes over 150 small GTPases (distinguished from the large, heterotrimeric GTPases, the G-proteins). It comprises six subfamilies, the Ras, Rho, Ran, Rab, Arf, and Kir/Rem/Rad subfamilies. They exhibit remarkable overall amino acid identities, especially in the regions interacting with the guanine nucleotide exchange factors that catalyse their activation.

    Proteins where this domain is known:
    MAL13P1.241    PFA0335w    PFB0500c    PFI0155c    PFL1500w   


    PR00472 - CASNKINASEII (Prints link)

    Interpro entry IPR000704 : Casein kinase II, regulatory subunit (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.

    Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta subunit is a highly conserved protein of about 25 kD that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.

    Proteins where this domain is known:
    PF11_0048    PF13_0232   


    PR00475 - HEXOKINASE (Prints link)

    Interpro entry IPR001312 : Hexokinase (Interpro link)

    Interpro description:

    Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.

    The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.

    The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.

    Proteins where this domain is known:
    PFF1155w   


    PR00476 - PHFRCTKINASE (Prints link)

    Interpro entry IPR000023 : Phosphofructokinase (Interpro link)

    Interpro description:
    The enzyme-catalysed transfer of a phosphoryl group from ATP is an important reaction in a wide variety of biological processes. One enzyme that utilises this reaction is phosphofructokinase (PFK), which catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate, a key regulatory step in the glycolytic pathway. PFK exists as a homotetramer in bacteria and mammals (where each monomer possesses 2 similar domains), and as an octamer in yeast (where there are 4 alpha- (PFK1) and 4 beta-chains (PFK2), the latter, like the mammalian monomers, possessing 2 similar domains).

    PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.

    Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.

    Proteins where this domain is known:
    PFI0755c   


    PR00477 - PHGLYCKINASE (Prints link)

    Interpro entry IPR001576 : Phosphoglycerate kinase (Interpro link)

    Interpro description:

    Phosphoglycerate kinase (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase of glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.

    PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.

    Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.

    This entry represents the full PGK enzyme.

    Proteins where this domain is known:
    PFI1105w   


    PR00481 - LAMNOPPTDASE (Prints link)

    Interpro entry IPR000819 : Peptidase M17, leucyl aminopeptidase, C-terminal (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
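
    As an illustration of the two motif definitions above, the following minimal Python sketch scans a sequence for the plain HEXXH motif and for the stricter 'abXHEbbHbc' form. The residue sets chosen for 'b' (uncharged) and 'c' (hydrophobic) are assumptions made for this example, not definitions taken from MEROPS or Interpro, and the toy sequence is invented rather than a real P. falciparum protein.

```python
import re

# Assumed residue classes (illustrative only, not an official definition).
# Proline is left out of the uncharged set because the text notes it is
# never found in this site.
UNCHARGED = "ACFGILMNQSTVWY"
HYDROPHOBIC = "AFILMVWY"

# Plain motif: His, Glu, any two residues, His.
HEXXH = "HE..H"
# Stricter motif 'abXHEbbHbc': a = any residue, b = uncharged,
# X = any residue, c = hydrophobic.
STRICT = f".[{UNCHARGED}].HE[{UNCHARGED}][{UNCHARGED}]H[{UNCHARGED}][{HYDROPHOBIC}]"


def find_motif(pattern, sequence):
    """Return 0-based start positions of all matches, overlaps included."""
    # Wrapping the pattern in a lookahead lets overlapping hits be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


# Hypothetical toy sequence containing one strict site and one
# proline-containing HEXXH site.
toy = "MKVAGHELTHAVSSHEPPHK"
hexxh_hits = find_motif(HEXXH, toy)
strict_hits = find_motif(STRICT, toy)
print("HEXXH starts:", hexxh_hits)
print("abXHEbbHbc starts:", strict_hits)
```

    In this toy case the proline-containing HEPPH site satisfies the plain pattern but is rejected by the stricter one, which is the distinction the description draws.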

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to the MEROPS peptidase family M17 (leucyl aminopeptidase family, clan MF), the type example being leucyl aminopeptidase from Bos taurus (Bovine).

    Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids.

    The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer. The 2 zinc ions and the active site are entirely located in the C-terminal catalytic domain.

    Proteins where this domain is known:
    PF14_0439   


    PR00501 - KELCHREPEAT (Prints link)

    Interpro entry IPR006651 : (Interpro link)

    Interpro description:

    Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.

    The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.

    This entry represents a kelch sequence motif that comprises one beta-sheet blade.

    Proteins where this domain is known:
    PF13_0238   


    PR00503 - BROMODOMAIN (Prints link)

    Interpro entry IPR001487 : (Interpro link)

    Interpro description:
    Bromodomains are found in a variety of mammalian, invertebrate and yeast DNA-binding proteins. Bromodomains can interact with acetylated lysine. In some proteins, the classical bromodomain has diverged to such an extent that parts of the region are either missing or contain an insertion (e.g., mammalian protein HRX, Caenorhabditis elegans hypothetical protein ZK783.4, yeast protein YTA7). The bromodomain may occur as a single copy, or in duplicate.

    The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.

    Proteins where this domain is known:
    PF10_0328    PF14_0724    PFL0635c    PFL1645w   


    PR00507 - N12N6MTFRASE (Prints link)

    Interpro entry IPR002296 : N6 adenine-specific DNA methyltransferase, N12 class (Interpro link)

    Interpro description:

    In prokaryotes, the major role of DNA methylation is to protect host DNA against degradation by restriction enzymes. There are 2 major classes of DNA methyltransferase that differ in the nature of the modifications they effect. The members of one class (C-MTases) methylate a ring carbon and form C5-methylcytosine. Members of the second class (N-MTases) methylate exocyclic nitrogens and form either N4-methylcytosine (N4-MTases) or N6-methyladenine (N6-MTases). Both classes of MTase utilise the cofactor S-adenosyl-L-methionine (SAM) as the methyl donor and are active as monomeric enzymes.

    N-6 adenine-specific DNA methylases (A-Mtase) are enzymes that specifically methylate the amino group at the C-6 position of adenines in DNA. Such enzymes are found in the three existing types of bacterial restriction-modification systems (in the type I system the A-Mtase is the product of the hsdM gene, and in type III it is the product of the mod gene). All of these enzymes recognise a specific sequence in DNA and methylate an adenine in that sequence. It has been shown that A-Mtases contain a conserved motif Asp/Asn-Pro-Pro-Tyr/Phe in their N-terminal section; this conserved region could be involved in substrate binding or in the catalytic activity. The structure of N6-MTase TaqI (M.TaqI) has been resolved to 2.4 Å. The molecule folds into 2 domains, an N-terminal catalytic domain, which contains the catalytic and cofactor binding sites, and comprises a central 9-stranded beta-sheet, surrounded by 5 helices; and a C-terminal DNA recognition domain, which is formed by 4 small beta-sheets and 8 alpha-helices. The N- and C-terminal domains form a cleft that accommodates the DNA substrate. A classification of N-MTases has been proposed, based on conserved motif (CM) arrangements. According to this classification, N6-MTases that have an NPPY motif (CM II) occurring after the FxGxG motif (CM I) are designated N12 class N6-adenine MTases.
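
    The motif-order rule described above can be sketched in Python as follows. This is a simplified illustration only: it takes the first occurrence of each motif, treats FxGxG and Asp/Asn-Pro-Pro-Tyr/Phe as plain regular expressions, and uses invented toy sequences. The function name and the returned labels are hypothetical and are not part of any published classification tool.

```python
import re

CM_I = re.compile(r"F.G.G")         # FxGxG motif (CM I)
CM_II = re.compile(r"[DN]PP[YF]")   # Asp/Asn-Pro-Pro-Tyr/Phe motif (CM II)


def motif_order(sequence):
    """Report the relative order of the first CM I and first CM II hit."""
    seq = sequence.upper()
    cm1 = CM_I.search(seq)
    cm2 = CM_II.search(seq)
    if cm1 is None or cm2 is None:
        # One or both motifs are missing, so the order rule cannot be applied.
        return "unclassified"
    if cm1.start() < cm2.start():
        # CM II after CM I is the arrangement the text calls N12 class.
        return "CM I before CM II (N12-like)"
    return "CM II before CM I"


# Hypothetical toy sequences, not real methyltransferases.
print(motif_order("MAAFAGSGKKLLNPPYAA"))
print(motif_order("MADPPFKKKFSGTGAA"))
print(motif_order("MKKK"))
```

    A real classification would also need to account for the other conserved motifs and for spacing between them, which this sketch ignores.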

    Proteins where this domain is known:
    PF13_0236   


    PR00599 - MAPEPTIDASE (Prints link)

    Interpro entry IPR001714 : Peptidase M24, methionine aminopeptidase (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamilies M24A and M24B.

    Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAP studied to date are monomeric proteins that require cobalt ions for activity.

    Two subfamilies of MAP enzymes are known to exist. While being evolutionarily related, they share only a limited amount of sequence similarity, mostly clustered around the residues shown to be involved in cobalt-binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2. The second subfamily also includes proteins which do not seem to be MAP, but that are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.

    Proteins where this domain is known:
    MAL8P1.140    PF10_0150    PF14_0327    PFE1360c   


    PR00603 - CYTOCHROMEC1 (Prints link)

    Interpro entry IPR002326 : Cytochrome c1 (Interpro link)

    Interpro description:
    Cytochrome bc1 complex (ubiquinol:ferricytochrome c oxidoreductase) is found in mitochondria, photosynthetic bacteria and other prokaryotes. It is minimally composed of three subunits: cytochrome b, carrying a low- and a high-potential haem group; cytochrome c1 (cyt c1); and a high-potential Rieske iron-sulphur protein. The general function of the complex is electron transfer between two mobile redox carriers, ubiquinol and cytochrome c; the electron transfer is coupled with proton translocation across the membrane, thus generating proton-motive force in the form of an electrochemical potential that can drive ATP synthesis. In its structure and functions, the cytochrome bc1 complex bears extensive analogy to the cytochrome b6f complex of chloroplasts and cyanobacteria; cyt c1 plays an analogous role to cytochrome f, in spite of their different structures.

    Proteins where this domain is known:
    PF14_0597   


    PR00604 - CYTCHRMECIAB (Prints link)

    Interpro entry IPR002327 : Cytochrome c, class IA/ IB (Interpro link)

    Interpro description:
    Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes.

    Ambler recognised four classes of cytC.

    Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N-terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C-terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cyt C and prokaryotic 'short' cyt C2 exemplified by Rhodopila globiformis cyt C2; Class IA includes 'long' cyt C2, such as Rhodospirillum rubrum cyt C2 and Aquaspirillum itersonii cyt C-550, which have several extra loops by comparison with Class IB cyt C.

    The 3D structures of a considerable number of class IA and IB cytC have been determined. The proteins consist of 3-6 alpha-helices; the three most conserved 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent. Most class I cytC have conserved aromatic residues clustered around the haem and axial ligands.

    Proteins where this domain is known:
    PF14_0038   


    PR00615 - CCAATSUBUNTA (Prints link)

    Interpro entry IPR003957 : Transcription factor, CBFA/NFYB, DNA topoisomerase (Interpro link)

    Interpro description:

    The CCAAT-binding factor (CBF) is a mammalian transcription factor that binds to a CCAAT motif in the promoters of a wide variety of genes, including type I collagen and albumin. The factor is a heteromeric complex of A and B subunits, both of which are required for DNA-binding. The subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating this interaction.

    The A subunit can be split into 3 domains on the basis of sequence similarity, a non-conserved N-terminal 'A domain'; a highly-conserved central 'B domain' involved in DNA-binding; and a C-terminal 'C domain', which contains a number of glutamine and acidic residues involved in protein-protein interactions. The A subunit shows striking similarity to the HAP3 subunit of the yeast CCAAT-binding heterotrimeric transcription factor. The Kluyveromyces lactis HAP3 protein has been predicted to contain a 4-cysteine zinc finger, which is thought to be present in similar HAP3 and CBF subunit A proteins, in which the third cysteine is replaced by a serine. This family also includes DNA topoisomerase II, which controls the topology of DNA by transient breaking of the strands and rejoining.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF11_0477   


    PR00620 - HISTONEH2A (Prints link)

    Interpro entry IPR002119 : Histone H2A (Interpro link)

    Interpro description:
    Histone H2A is a small, highly conserved nuclear protein that, together with 2 molecules each of histones H2B, H3 and H4, forms the eukaryotic nucleosome core; the nucleosome octamer winds ~146 DNA base-pairs.

    Proteins where this domain is known:
    PFC0920w   


    PR00621 - HISTONEH2B (Prints link)

    Interpro entry IPR000558 : Histone H2B (Interpro link)

    Interpro description:
    Histone H2B is one of the four histones, along with H2A, H3 and H4, which form the eukaryotic nucleosome core. It is a small, highly conserved nuclear protein that, together with 2 molecules each of histones H2A, H3 and H4, forms the eukaryotic nucleosome core; the nucleosome octamer winds ~146 DNA base-pairs.

    Proteins where this domain is known:
    PF07_0054    PF11_0062   


    PR00622 - HISTONEH3 (Prints link)

    Interpro entry IPR000164 : Histone H3 (Interpro link)

    Interpro description:

    Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the eukaryotic nucleosome octamer core; the nucleosome octamer winds ~146 DNA base-pairs. It is a highly conserved protein of 135 amino acid residues.

    Several proteins have been found to contain a C-terminal H3-like domain, including the mammalian centromeric protein CENP-A (which may act as a core histone necessary for the assembly of centromeres); yeast chromatin-associated protein CSE4; and Caenorhabditis elegans chromosome III proteins YL82_CAEEL and YMH3_CAEEL, whose function is unknown.

    Proteins where this domain is known:
    PF13_0185   


    PR00625 - DNAJPROTEIN (Prints link)

    Interpro entry IPR003095 : Heat shock protein DnaJ (Interpro link)

    Interpro description:

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.

    DnaJ comprises a 70-residue N-terminal domain (the J-domain); a 30-residue glycine-rich region (the G-domain); a central domain containing 4 repeats of a CxxCxGxG motif (the CRR-domain); and a 120-170 residue C-terminal region. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately.
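    The CxxCxGxG repeat described above can be located with a simple pattern search. The following is a minimal sketch only: the function name and the toy sequence are hypothetical, 'x' is taken to mean any residue, and a real domain assignment relies on the PRINTS fingerprint rather than on this pattern.

```python
import re

# CxxCxGxG: Cys, two arbitrary residues, Cys, one arbitrary residue,
# Gly, one arbitrary residue, Gly ('x' assumed to be any amino acid).
CRR_MOTIF = re.compile(r"C[A-Z]{2}C[A-Z]G[A-Z]G")


def find_crr_repeats(sequence):
    """Return 0-based start positions of non-overlapping CxxCxGxG matches."""
    return [m.start() for m in CRR_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence with four repeats, not a real DnaJ protein.
toy = "MAKCPTCHGSGAAACDKCGGKGTTCSVCNGAGLLCERCQGTGK"
print(find_crr_repeats(toy))
```

    A sequence with four hits is consistent with the CRR-domain layout given in the text, but the count alone does not establish that the protein is a DnaJ homologue.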

    Proteins where this domain is known:
    PF08_0032    PF08_0115    PF10_0032    PF10_0378    PF10_0381    PF11_0034    PF11_0099    PF11_0380    PF11_0509    PF11_0512    PF11_0513    PF13_0102    PF14_0137    PF14_0359    PFA0660w    PFB0085c    PFB0090c    PFB0920w    PFB0925w    PFD0462w    PFE0055c    PFE1170w    PFF1415c    PFI0935w    PFL0055c   


    PR00633 - RCCNDNSATION (Prints link)

    Interpro entry IPR000408 : (Interpro link)

    Interpro description:

    The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with ran, a nuclear GTP-binding protein to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with ran probably plays an important role in the regulation of gene expression.

    RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. The repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.

    The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase regulator. The RCC repeats form a beta-propeller structure.

    Proteins where this domain is known:
    MAL7P1.38    PF11_0385    PF13_0303    PFE0420c   


    PR00660 - ERLUMENR (Prints link)

    Interpro entry IPR000133 : ER lumen protein retaining receptor (Interpro link)

    Interpro description:

    Proteins resident in the lumen of the endoplasmic reticulum (ER) contain a C-terminal tetrapeptide, commonly known as Lys-Asp-Glu-Leu (KDEL) in mammals and His-Asp-Glu-Leu (HDEL) in yeast (Saccharomyces cerevisiae) that acts as a signal for their retrieval from subsequent compartments of the secretory pathway. The receptor for this signal is a ~26 kDa Golgi membrane protein, initially identified as the ERD2 gene product in S. cerevisiae. The receptor molecule, known variously as the ER lumen protein retaining receptor or the 'KDEL receptor', is believed to cycle between the cis side of the Golgi apparatus and the ER. It has also been characterised in a number of other species, including plants, Plasmodium, Drosophila and mammals. In mammals, 2 highly related forms of the receptor are known.
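    As a small illustration of the retrieval signal described above, the check below tests whether a sequence ends in one of the two tetrapeptides named in the text. It is a sketch under stated assumptions: the function name and the example sequences are hypothetical, and only KDEL and HDEL are considered, although other variants may exist in other species.

```python
# The two C-terminal tetrapeptides named in the text.
ER_RETRIEVAL_SIGNALS = ("KDEL", "HDEL")


def er_retrieval_signal(sequence):
    """Return the C-terminal retrieval tetrapeptide, or None if absent."""
    seq = sequence.strip().upper()
    for signal in ER_RETRIEVAL_SIGNALS:
        if seq.endswith(signal):
            return signal
    return None


# Hypothetical toy sequences, not real proteins.
print(er_retrieval_signal("MSTAVLENPGKDEL"))
print(er_retrieval_signal("MSTAVLENPGKDELA"))
```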

    The KDEL receptor is a highly hydrophobic protein of 220 residues; its sequence exhibits 7 hydrophobic regions, all of which have been suggested to traverse the membrane. More recently, however, it has been suggested that only 6 of these regions are transmembrane (TM), resulting in both N- and C-termini on the cytoplasmic side of the membrane.

    Proteins where this domain is known:
    MAL13P1.163   


    PR00662 - G6PISOMERASE (Prints link)

    Interpro entry IPR001672 : Phosphoglucose isomerase (PGI) (Interpro link)

    Interpro description:

    Phosphoglucose isomerase (PGI) is a dimeric enzyme that catalyses the reversible isomerization of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein, PGI, is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine.

    PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability.

    Proteins where this domain is known:
    PF14_0341   


    PR00679 - PROHIBITIN (Prints link)

    Interpro entry IPR000163 : Prohibitin (Interpro link)

    Interpro description:

    Genes that negatively regulate proliferation inside the cell are of considerable interest because of the implications in processes such as development and cancer. Prohibitin, a novel cytoplasmic anti-proliferative protein widely expressed in a variety of tissues, inhibits DNA synthesis. Studies have suggested that prohibitin may be a suppressor gene and is associated with tumour development and/or progression of at least some breast cancers. Sequence comparisons suggest that the prohibitin gene is an analogue of Cc, a Drosophila melanogaster gene that is vital for normal development.

    Proteins where this domain is known:
    PF08_0006    PF10_0144   


    PR00685 - TIFACTORIIB (Prints link)

    Interpro entry IPR000812 : Transcription factor TFIIB related (Interpro link)

    Interpro description:

    In eukaryotes, transcription initiation by polymerase II is modulated by both general and specific transcription factors. The general factors (which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIG and TFIIH) operate through common promoter elements, such as the TATA box. Transcription factor IIB (TFIIB) is of central importance in transcription of class II genes. It associates with TFIID-TFIIA bound to DNA (the DA complex) to form a ternary TFIID-IIA-IIB (DAB) complex, which is recognised by RNA polymerase II. TFIIB comprises ~315-340 residues and contains an imperfect C-terminal repeat of a 75-residue domain that may contribute to the symmetry of the folded protein.

    Proteins where this domain is known:
    PF14_0469    PFA0525w   


    PR00686 - TIFACTORIID (Prints link)

    Interpro entry IPR000814 : TATA-box binding (Interpro link)

    Interpro description:

    The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins.

    The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins. By contrast, the N-terminal region varies in both length and sequence.

    Proteins where this domain is known:
    PFE0305w   


    PR00689 - ACOABINDINGP (Prints link)

    Interpro entry IPR000582 : Acyl-CoA-binding protein, ACBP (Interpro link)

    Interpro description:

    Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.

    ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.

    Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal Delta(3),Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.

    The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.

    Proteins where this domain is known:
    PF08_0099    PF10_0015    PF10_0016    PF14_0749   


    PR00705 - PAPAIN (Prints link)

    Interpro entry IPR000668 : Peptidase C1A, papain C-terminal (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologues. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.

    The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.

    The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.

    Proteins where this domain is known:
    PF11_0165    PF14_0553    PFB0325c    PFL2290w   


    PR00707 - UBCTHYDRLASE (Prints link)

    Interpro entry IPR001578 : Peptidase C12, ubiquitin carboxyl-terminal hydrolase 1 (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like, as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example of family C12 is the human ubiquitin C-terminal hydrolase UCH-L1.

    Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.

    The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa peptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.

    Proteins where this domain is known:
    PF14_0576   


    PR00721 - STOMATIN (Prints link)

    Interpro entry IPR001972 : Stomatin (Interpro link)

    Interpro description:

    Stomatin is also known as erythrocyte membrane protein band 7.2b. It is a 31 kDa membrane protein, and was named after the rare human disease: haemolytic anaemia hereditary stomatocytosis. The protein contains a single hydrophobic domain, close to the N-terminus, and is phosphorylated.

    Stomatin is believed to be involved in regulating monovalent cation transport through lipid membranes. Absence of the protein in hereditary stomatocytosis is believed to be the reason for the leakage of Na+ and K+ ions into and from erythrocytes.

    A second function of stomatin is to act as a cytoskeletal anchor. One possible example of this is its interaction with some anti-malarial drugs. Current opinion speculates that such drugs bind to high density lipoproteins in serum. The lipoproteins are delivered to erythrocytes, where it is believed they interact with stomatin as a means of transfer to the intracellular parasite, via a pathway used for the uptake of exogenous phospholipid.

    Stomatin-like proteins have been identified in various organisms, including Caenorhabditis elegans and Mus musculus.

    Proteins where this domain is known:
    PFC0800w   


    PR00723 - SUBTILISIN (Prints link)

    Interpro entry IPR000209 : Peptidase S8 and S53, subtilisin, kexin, sedolisin (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
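    The link between linear triad order and clan can be made concrete with a short lookup. This is a sketch only: the function names are hypothetical, the residue positions in the example are illustrative inputs, and the mapping covers only the three orders quoted in the text, so it is not a substitute for a MEROPS classification.

```python
# Linear triad orders quoted in the text, mapped to their clans.
TRIAD_ORDER_TO_CLAN = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(positions):
    """positions maps a one-letter residue code to its sequence position.

    Returns the residue letters sorted by position, e.g. "HDS".
    """
    return "".join(sorted(positions, key=positions.get))


def clan_from_triad(positions):
    """Return the clan for a known triad order, or None otherwise."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions))


# Illustrative positions for a chymotrypsin-like enzyme (His, Asp, Ser).
example = {"S": 195, "H": 57, "D": 102}
print(triad_order(example), clan_from_triad(example))
```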

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.

    The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.

    The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only 1 viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish.

    Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.
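    The preference for cleavage C-terminal to paired basic residues can be sketched as a scan for adjacent Lys/Arg pairs. This is a deliberate simplification with a hypothetical function name and toy sequence: any of KK, KR, RK and RR is treated as a candidate site, whereas real kexin-like enzymes are more selective and depend on surrounding residues.

```python
import re

# Zero-width lookahead so that overlapping pairs (e.g. "KKR") are all found.
PAIRED_BASIC = re.compile(r"(?=[KR][KR])")


def paired_basic_cut_sites(sequence):
    """Return cut indices i such that cleavage yields seq[:i] and seq[i:].

    A cut is placed immediately after each pair of basic residues;
    pairs at the very C-terminus are skipped because nothing follows them.
    """
    seq = sequence.upper()
    cuts = [m.start() + 2 for m in PAIRED_BASIC.finditer(seq)]
    return [i for i in cuts if i < len(seq)]


# Hypothetical toy propeptide, not a real substrate.
toy = "MAAPLKRSVEEGG"
for cut in paired_basic_cut_sites(toy):
    print(toy[:cut], "|", toy[cut:])
```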

    Proteins where this domain is known:
    PF11_0381    PFE0355c    PFE0370c   


    PR00727 - LEADERPTASE (Prints link)

    Interpro entry IPR000223 : Peptidase S26A, signal peptidase I (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A.

    At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane. Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric.

    Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space. The catalytic residues of a number of these peptidases have been identified as a serine/lysine dyad.

    Proteins where this domain is known:
    PF13_0118   


    PR00728 - SIGNALPTASE (Prints link)

    Interpro entry IPR001733 : Peptidase S26B, eukaryotic signal peptidase (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26B.

    Eukaryotic microsomal signal peptidase is involved in the removal of signal peptides from secretory proteins as they pass into the endoplasmic reticulum lumen. The peptidase is more complex than its mitochondrial and bacterial counterparts, containing a number of subunits, ranging from two in the chicken oviduct peptidase, to five in the dog pancreas protein. They share sequence similarity with the bacterial leader peptidases (family S26A), although activity here is mediated by a serine/histidine dyad rather than a serine/lysine dyad. Archaeal signal peptidases also belong to this group.

    Proteins where this domain is known:
    MAL13P1.167   


    PR00773 - GRPEPROTEIN (Prints link)

    Interpro entry IPR000740 : GrpE nucleotide exchange factor (Interpro link)

    Interpro description:

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    The X-ray crystal structure of GrpE in complex with the ATPase domain of DnaK revealed that GrpE is an asymmetric homodimer, bent in a manner that favours extensive contacts with only one DnaK ATPase monomer. GrpE does not actively compete for the atomic positions occupied by the nucleotide. GrpE and ADP mutually reduce one another's affinity for DnaK 200-fold, and ATP instantly dissociates GrpE from DnaK.

    Proteins where this domain is known:
    PF11_0258   


    PR00775 - HEATSHOCK90 (Prints link)

    Interpro entry IPR001404 : Heat shock protein Hsp90 (Interpro link)

    Interpro description:

    Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1-2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.

    The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.

    Proteins where this domain is known:
    PF11_0188    PF14_0417    PFL1070c   


    PR00789 - OSIALOPTASE (Prints link)

    Interpro entry IPR017861 : Peptidase M22, glycoprotease, subgroup (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophillic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
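    The difference between the loose HEXXH motif and the stricter 'abXHEbbHbc' definition can be sketched with two patterns. The residue classes are assumptions made for illustration: 'b' (uncharged) is taken as anything other than D, E, K or R, 'c' (hydrophobic) as one of A, V, L, I, M, F, W or C, and 'a' is left unrestricted because the text says only that it is most often valine or threonine. The function name and test peptides are hypothetical.

```python
import re

HEXXH = re.compile(r"HE..H")

# Assumed residue classes; the source does not enumerate them.
UNCHARGED = "[^DEKR]"        # 'b'
HYDROPHOBIC = "[AVLIMFWC]"   # 'c'

# a b X H E b b H b c
STRINGENT = re.compile(
    rf".{UNCHARGED}.HE{UNCHARGED}{UNCHARGED}H{UNCHARGED}{HYDROPHOBIC}"
)


def classify_site(sequence):
    """Return 'stringent', 'HEXXH only' or 'none' for a sequence."""
    seq = sequence.upper()
    if STRINGENT.search(seq):
        return "stringent"
    if HEXXH.search(seq):
        return "HEXXH only"
    return "none"


# Hypothetical test peptides.
print(classify_site("GGVVAHELTHAVGG"))
print(classify_site("AAAHEDDHAV"))
```

    The second peptide contains HEXXH but has charged residues at the 'b' positions, so under these assumptions it fails the stricter definition.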

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to MEROPS peptidase family M22 (clan MK). The type example is O-sialoglycoprotein endopeptidase from Pasteurella haemolytica (Mannheimia haemolytica).

    O-Sialoglycoprotein endopeptidase is secreted by the bacterium P. haemolytica, and digests only proteins that are heavily sialylated, in particular those with sialylated serine and threonine residues. Substrate proteins include glycophorin A and leukocyte surface antigens CD34, CD43, CD44 and CD45. Removal of glycosylation, by treatment with neuraminidase, completely negates susceptibility to O-sialoglycoprotein endopeptidase digestion.

    Sequence similarity searches have revealed other members of the M22 family, from yeast, Mycobacterium, Haemophilus influenzae and the cyanobacterium Synechocystis. The zinc-binding and catalytic residues of this family have not been determined, although the motif HMEGH may be a zinc-binding region.

    Proteins where this domain is known:
    PF10_0299   


    PR00792 - PEPSIN (Prints link)

    Interpro entry IPR001461 : Peptidase A1 (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).

    More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation.

    Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event.

    This family does not include the retroviral or retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.

    Proteins where this domain is known:
    PF14_0075    PF14_0281    PFC0495w   


    PR00799 - TRANSAMINASE (Prints link)

    Interpro entry IPR000796 : Aspartate/other aminotransferase (Interpro link)

    Interpro description:
    Aspartate aminotransferase is important for the metabolism of amino acids and Krebs-cycle related organic acids. In plants, it is involved in nitrogen metabolism and in aspects of carbon and energy metabolism. The enzyme catalyses the reaction:
     L-aspartate + 2-oxoglutarate = oxaloacetate + L-glutamate 
    Aminotransferases share certain mechanistic features with other pyridoxal-phosphate-dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. This family includes some aromatic-amino-acid aminotransferases too.

    Proteins where this domain is known:
    PFB0200c   


    PR00830 - ENDOLAPTASE (Prints link)

    Interpro entry IPR001984 : Peptidase S16, Lon protease, C-terminal region (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to the MEROPS peptidase family S16 (lon protease family, clan SF). The type example is the lon protease of Escherichia coli.

    Lon (La) protease was the first ATP-dependent protease to be purified from E. coli. The enzyme is a homotetramer of 87 kDa subunits, with one proteolytic and one ATP-binding site per monomer, making it structurally less complex than other known ATP-dependent proteases. Despite this relative structural simplicity, lon recognises its substrates directly, without delegating the task of substrate recognition to other enzymes. By contrast, ClpP endopeptidases (S14, clan SK) are multimeric assemblies of two different types of subunit, one of which has ATPase activity, and the other has proteolytic activity.

    Other members of this group include:

    The family also includes proteins classified as non-peptidase homologues that either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity. A significant number of the non-peptidase homologues of S16 are described as Mg chelatase-related proteins.

    Proteins where this domain is known:
    PF14_0147   


    PR00834 - PROTEASES2C (Prints link)

    Interpro entry IPR001940 : Peptidase S1C, HrtA/DegP2/Q/S (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases and non-peptidase homologues belongs to the MEROPS peptidase family S1, subfamily S1C (protease Do subfamily, clan PA(S)). A type example is the protease Do from Escherichia coli.

    Other members of this group include the E. coli htrA gene product (HtrA or DegP protein), which is essential for bacterial survival at temperatures above 42 °C and for digesting misfolded protein in the periplasm. Mature DegP from E. coli has 448 residues, of which His105, Asp135, and Ser210 form the catalytic triad. The protein has an N-terminal sequence typical of a leader peptide. Structural analysis indicates that bacterial HtrA is a serine protease belonging to the family of cage-forming proteases and that only unfolded polypeptides can be threaded in extended conformation into the cage to access the proteolytic sites. Disulphide bonds of partially unfolded substrates impede protein breakdown and represent a conformational constraint for entering the inner cavity. This preference for unfolded polypeptides might also be a reason for the ATP-independent mode of action and for the increased proteolytic activity at higher temperatures.
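
    The relationship between the linear order of the catalytic residues and the clan, as described earlier in this entry, can be sketched in a few lines of Python. This is an illustration only: the dictionary and function names are hypothetical, and the lookup table contains just the three orders quoted in the text (HDS, DHS, SDH). The DegP positions are the ones given above.

```python
# Triad orders and clan labels exactly as quoted in the description.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(residues):
    """Return the one-letter residue codes sorted by sequence position.

    `residues` maps a one-letter code (H, D or S) to its position.
    """
    ordered = sorted(residues.items(), key=lambda item: item[1])
    return "".join(code for code, _position in ordered)


# His105, Asp135 and Ser210 of mature E. coli DegP.
degp = {"H": 105, "D": 135, "S": 210}
order = triad_order(degp)
print(order, "->", TRIAD_ORDER_TO_CLAN.get(order, "not in table"))
```

    For DegP the residues sort into the order HDS, which matches the chymotrypsin-type arrangement given in the text. An order that is not in the table is reported as such rather than guessed.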

    The HtrA family shares a modular architecture composed of an N-terminal segment believed to have regulatory functions, a conserved trypsin-like protease domain, and one or two PDZ domains which mediate specific protein-protein interactions and bind preferentially to the C-terminal three to four residues of the target protein. HtrA belongs to the trypsin clan SA. SA proteases have a two-domain structure with each domain forming a six-stranded barrel. The active site cleft is located at the interface of the two perpendicularly arranged barrel domains. The active site is constructed by several loops located at the C-terminal side of both barrel domains. The functional unit of HtrA appears to be a trimer, which is stabilized exclusively by residues of the protease domains. The basic trimer has a funnel-like shape with the protease domains located at its top and the PDZ domains protruding to the outside. Once substrates have been bound, they have to be delivered into the interior of the funnel and the proteolytic sites. In contrast to other protease-chaperone systems, ATP does not drive binding and release of substrates.

    The degQ and degS genes of E. coli encode proteins of 455 and 355 residues that are homologues of the DegP protease. Purified DegQ protein has the properties of a serine endopeptidase, and is processed by the removal of a 27-residue N-terminal signal sequence. Deletion studies suggest that DegQ, like DegP, functions as a periplasmic protease in vivo.

    Proteins where this domain is known:
    MAL8P1.98   


    PR00851 - XRODRMPGMNTB (Prints link)

    Interpro entry IPR001161 : Xeroderma pigmentosum group B protein (XP-B) (Interpro link)

    Interpro description:

    Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.

    This entry represents XP group B (XP-B) proteins; defects in XP-B give rise to both XP and Cockayne syndrome. The DNA/RNA helicase domain is also present in this group of proteins.

    Proteins where this domain is known:
    PF10_0369   


    PR00853 - XPGRADSUPER (Prints link)

    Interpro entry IPR006084 : DNA repair protein (XPGC)/yeast Rad (Interpro link)

    Interpro description:

    Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
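
    The conserved pentapeptide E-A-[DE]-A-[QS] quoted above translates directly into a regular expression. The Python sketch below shows one way to scan a sequence for it; the function name pentapeptide_hits and the toy sequence are hypothetical and serve only to demonstrate the pattern syntax.

```python
import re

# E-A-[DE]-A-[QS]: Glu, Ala, Asp-or-Glu, Ala, Gln-or-Ser.
PENTAPEPTIDE = re.compile(r"EA[DE]A[QS]")


def pentapeptide_hits(seq):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in PENTAPEPTIDE.finditer(seq.upper())]


# Toy sequence invented for this example (not a real XPG/RAD2 sequence).
toy = "GGEAEAQLLEADASKK"
print(pentapeptide_hits(toy))  # [(2, 'EAEAQ'), (9, 'EADAS')]
```

    A hit on such a short pattern is not evidence of family membership by itself; in the text the pentapeptide sits inside a conserved core of 27 residues within the I-region.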

    Proteins where this domain is known:
    PF07_0105    PF10_0080    PFB0265c   


    PR00868 - DNAPOLI (Prints link)

    Interpro entry IPR002298 : DNA polymerase A (Interpro link)

    Interpro description:

    DNA carries the biological information that instructs cells how to exist in an ordered fashion. Accurate replication is thus one of the most important events in the cell life cycle. This function is mediated by DNA-directed DNA polymerases, which add nucleotide triphosphate (dNTP) residues to the 3'-end of the growing DNA chain, using a complementary DNA as template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used. DNA-dependent DNA polymerases have been grouped into families, denoted A, B and X, on the basis of sequence similarities. Members of family A, which includes bacterial and bacteriophage polymerases, share significant similarity to Escherichia coli polymerase I; hence family A is also known as the pol I family. The bacterial polymerases also contain an exonuclease activity, which is coded for in the N-terminal portion. Three motifs, A, B and C, are seen to be conserved across all DNA polymerases, with motifs A and C also seen in RNA polymerases. They are centred on invariant residues, and their structural significance was implied from the Klenow (E. coli) structure. Motif A contains a strictly-conserved aspartate at the junction of a beta-strand and an alpha-helix; motif B contains an alpha-helix with positive charges; and motif C has a doublet of negative charges, located in a beta-turn-beta secondary structure.

    Proteins where this domain is known:
    PF14_0112    PFF1225c   


    PR00881 - L7ARS6FAMILY (Prints link)

    Interpro entry IPR018492 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The genomic structure and sequence of the human ribosomal protein L7a has been determined and shown to resemble other mammalian ribosomal protein genes. The sequence of a gene for ribosomal protein L4 of yeast has also been determined; its single open reading frame is highly similar to mammalian ribosomal protein L7a. Several other ribosomal proteins have been found to share sequence similarity with L7a, including Saccharomyces cerevisiae NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui Hs6, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1203.

    Proteins where this domain is known:
    PF11_0250    PF14_0231    PFD0960c   


    PR00882 - RIBOSOMALL7A (Prints link)

    Interpro entry IPR001921 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The genomic structure and sequence of the human ribosomal protein L7a has been determined. The gene contains 8 exons and 7 introns, encompassing 3179 bp. The human gene resembles other mammalian ribosomal protein genes in that it has a short first exon, a short 5' untranslated leader, and transcriptional start sites at C residues embedded in a poly-pyrimidine tract.

    The sequence of a gene for ribosomal protein L4 of Saccharomyces cerevisiae (Baker's yeast) has also been determined, which, unlike most of its other ribosomal protein genes, has no intron. The single open reading frame is highly similar to mammalian ribosomal protein L7a.

    There appear to be two genes for L4, both of which are active. Yeast cells containing a disruption of the L4-1 gene form smaller colonies than either wild-type or disrupted L4-2 strains. Disruption of both L4 genes is lethal, probably resulting from an inability of the organism to produce functional ribosomes.

    Several other ribosomal proteins have been found to share sequence similarity with L7a, including yeast NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui (Halobacterium marismortui) Hs6, and Methanocaldococcus jannaschii MJ1203.

    This InterPro entry focuses on regions that characterise the ribosomal L7A proteins but distinguish them from the rest of the HMG-like family.

    Proteins where this domain is known:
    PF14_0231   


    PR00883 - NUCLEARHMG (Prints link)

    Interpro entry IPR002415 : (Interpro link)

    Interpro description:
    The high mobility group (HMG)-like nuclear protein NHP2 from Saccharomyces cerevisiae has been termed 'HMG-like' in that it shares certain physical/chemical properties with HMG proteins from higher eukaryotes. It shows no significant sequence similarity to such proteins, and thus constitutes a distinct HMG protein class. NHP2 does share sequence similarity with two ribosomal proteins: the acidic ribosomal protein S6 from Haloarcula marismortui and mammalian L7a ribosomal protein.

    The biological implications of the observed similarities to S6 and L7a are unclear, as biochemical studies have indicated that NHP2 is not a ribosomal protein. Nevertheless, deletion experiments have indicated NHP2 to have an essential physiological function.

    Proteins where this domain is known:
    PF11_0250   


    PR00886 - HIGHMOBLTY12 (Prints link)

    Interpro entry IPR000135 : High mobility group, HMG1/HMG2, subgroup (Interpro link)

    Interpro description:

    High mobility group (HMG or HMGB) proteins constitute a family of relatively low molecular weight non-histone components in chromatin. HMG1 and HMG2 are highly similar, and preferentially bind single-stranded DNA and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.

    The 3D structure of part of the sequence (57-136), termed box 2, has been determined using 3D NMR. The protein exhibits an unusual all-alpha fold, which forms a V-shaped arrow-head, with helices along two edges and one rather flat face. Such an architecture is not shown by any of the currently known DNA-binding motifs. The majority of conserved residues in the HMG box family are those involved in maintaining the 3D fold.

    Proteins where this domain is known:
    MAL8P1.72   


    PR00887 - SSRCOGNITION (Prints link)

    Interpro entry IPR000969 : Structure-specific recognition protein (Interpro link)

    Interpro description:
    Human structure-specific recognition protein, SSRP1, binds specifically to DNA modified with the anti-cancer drug cisplatin. An 81 kD protein is predicted, containing several highly-charged domains and a stretch of 75 residues that share 47% identity with a portion of the high mobility group (HMG) protein HMG1. This HMG box probably constitutes the structure recognition element for cisplatin-modified DNA, the probable recognition motif being the local duplex unwinding and bending that occurs on formation of intra-strand cross-links. SSRP1 is the human homologue of a recently identified mouse protein that binds to recombination signal sequences. These sequences have been postulated to form stem-loop structures, further implicating local bends and unwinding in DNA as a recognition target for HMG-box proteins. A Drosophila melanogaster cDNA encoding an HMG-box-containing protein has also been isolated. This protein shares 50% sequence identity with human SSRP1. In vitro binding studies using Drosophila SSRP showed that the protein binds to single-stranded DNA and RNA, with highest affinity for nucleotides G and U. Comparison of the predicted amino acid sequences among SSRP family members reveals 48% identity, with structural conservation in the C-terminus of the HMG box, as well as domains of highly charged residues. The most highly conserved regions lie in the poorly understood N-terminus, suggesting that this portion of the protein is critical for its function.

    This entry contains Pob3, which is a subunit of the heterodimeric yeast FACT complex (Spt16p-Pob3p). The FACT complex facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.

    Proteins where this domain is known:
    PF14_0393   


    PR00891 - RABGDIREP (Prints link)

    Interpro entry IPR002005 : Rab GTPase activator (Interpro link)

    Interpro description:
    Rab proteins constitute a family of small GTPases that serve a regulatory role in vesicular membrane traffic; C-terminal geranylgeranylation is crucial for their membrane association and function. This post-translational modification is catalysed by Rab geranylgeranyl transferase (Rab-GGTase), a multi-subunit enzyme that contains a catalytic heterodimer and an accessory component, termed Rab escort protein (REP)-1. REP-1 presents newly synthesised Rab proteins to the catalytic component, and forms a stable complex with the prenylated proteins following the transfer reaction.

    The mechanism of REP-1-mediated membrane association of Rab5 is similar to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI also share other functional properties, including the ability to inhibit the release of GDP and to remove Rab proteins from membranes.

    The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.

    The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.

    Proteins where this domain is known:
    PF10_0373    PFL2060c   


    PR00892 - RABGDI (Prints link)

    Interpro entry IPR000806 : Rab GDI protein (Interpro link)

    Interpro description:
    Rab proteins, a family of small Ras-related GTP-binding proteins, are involved in regulation of intracellular vesicle trafficking. Rab GDP dissociation inhibitor (GDI) forms a soluble complex with Rab proteins, thereby preventing exchange of GDP for GTP. Rab GDI exists in several isoforms, and belongs to the TCD/MRS6 family of GDP dissociation inhibitors.

    The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.

    The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.

    Proteins where this domain is known:
    PFL2060c   
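
    Entries in this listing follow a regular layout: a signature header, an Interpro entry line, a description, and a line of protein identifiers. The sketch below shows one way such a layout could be read into a lookup table, for instance to find proteins listed under more than one signature. The function names and the regular expression are illustrative assumptions, not part of any Interpro or PRINTS tool.

```python
import re

# Signature header lines look like "PR00891 - RABGDIREP (Prints link)".
# Assumption: every header ends with "(<database> link)".
HEADER = re.compile(r"^(\S+) - (\S+) \(.* link\)$")
PROTEIN_MARKER = "Proteins where this domain is known:"

def proteins_by_signature(text):
    """Map each signature accession to the protein IDs listed under it."""
    result = {}
    current = None
    expect_ids = False
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            result[current] = []
            expect_ids = False
        elif line == PROTEIN_MARKER:
            expect_ids = True
        elif expect_ids and line and current is not None:
            # IDs are whitespace-separated on the line after the marker.
            result[current].extend(line.split())
            expect_ids = False
    return result

def shared_proteins(by_signature):
    """Return proteins that appear under more than one signature."""
    seen = {}
    for signature, proteins in by_signature.items():
        for protein in proteins:
            seen.setdefault(protein, []).append(signature)
    return {p: sigs for p, sigs in seen.items() if len(sigs) > 1}

# Abbreviated copy of the two Rab GDI entries above (descriptions omitted).
sample = """
PR00891 - RABGDIREP (Prints link)
Interpro entry IPR002005 : Rab GTPase activator (Interpro link)
Proteins where this domain is known:
PF10_0373    PFL2060c

PR00892 - RABGDI (Prints link)
Interpro entry IPR000806 : Rab GDI protein (Interpro link)
Proteins where this domain is known:
PFL2060c
"""

by_signature = proteins_by_signature(sample)
shared = shared_proteins(by_signature)
```

    On the abbreviated sample, PFL2060c is reported under both PR00891 and PR00892, matching the two entries above.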


    PR00926 - MITOCARRIER (Prints link)

    Interpro entry IPR002067 : Mitochondrial carrier protein (Interpro link)

    Interpro description:

    A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane. Such proteins include: ADP/ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others.

    Sequence analysis of selected members of the carrier protein family has suggested the presence of six transmembrane (TM) domains, with varying degrees of sequence conservation and hydrophilicity. The TM regions, and adjacent hydrophilic loops, are more highly conserved than other regions of the proteins. All members of the family appear to consist of a tripartite structure, each of the repeated segments being about 100 residues in length. Each repeat contains two TM domains, the first being more hydrophobic, with conserved glycyl and prolyl residues. Five of the six TM domains are followed by the conserved sequence (D/E)-Hy(K/R) {where - denotes any residue, and Hy is a hydrophobic position}.
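
    The conserved sequence (D/E)-Hy(K/R) described above can be expressed as a simple pattern search. The following is a minimal sketch only: the residues counted as hydrophobic, the function name and the toy sequence are all assumptions made for illustration.

```python
import re

# Assumption: this set of "hydrophobic" (Hy) residues is illustrative only.
HYDROPHOBIC = "AILMFVWY"

# (D/E)-Hy(K/R): acidic residue, any residue, hydrophobic residue, basic residue.
CARRIER_MOTIF = re.compile(rf"[DE].[{HYDROPHOBIC}][KR]")

def find_carrier_motifs(sequence):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group())
            for m in CARRIER_MOTIF.finditer(sequence.upper())]

# Toy sequence invented for illustration; not a real Plasmodium protein.
toy = "GGDALKTTPEQVRSS"
hits = find_carrier_motifs(toy)
```

    A hit only shows that the four-residue pattern occurs; it does not by itself indicate that the position follows a transmembrane domain.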

    Proteins where this domain is known:
    PF10_0051    PF10_0366    PF13_0359    PFD0367w    PFI0425w   


    PR00927 - ADPTRNSLCASE (Prints link)

    Interpro entry IPR002113 : Adenine nucleotide translocator 1 (Interpro link)

    Interpro description:

    A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane. Such proteins include: ADP/ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others.

    Sequence analysis of selected members of the carrier protein family has suggested the presence of six transmembrane (TM) domains, with varying degrees of sequence conservation and hydrophilicity. The TM regions, and adjacent hydrophilic loops, are more highly conserved than other regions of the proteins. All members of the family appear to consist of a tripartite structure, each of the repeated segments being about 100 residues in length. Each repeat contains two TM domains, the first being more hydrophobic, with conserved glycyl and prolyl residues. Five of the six TM domains are followed by the conserved sequence (D/E)-Hy(K/R) {where - denotes any residue, and Hy is a hydrophobic position}.

    Mitochondrial ADP/ATP translocase, an abundant component of the inner membrane, carries ATP from the matrix into the inter-membrane space and transports ADP back. The protein is an integral membrane protein that functions as a homodimer.

    Proteins where this domain is known:
    PF10_0366   


    PR00932 - AMINO1PTASE (Prints link)

    Interpro entry IPR001948 : Peptidase M18, aminopeptidase I (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
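
    The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' definition can be illustrated with two patterns. This is a hedged sketch: position 'a' is left unconstrained because the text says only "most often" valine or threonine, and the residue sets chosen for 'b' (uncharged) and 'c' (hydrophobic) are assumptions, as are the function name and the toy sequences.

```python
import re

# Assumptions: residue classes below are illustrative choices, not a standard.
UNCHARGED = "ACFGILMNPQSTVWY"   # excludes D, E, K, R and H
HYDROPHOBIC = "AILMFVWY"

# Loose motif: H, E, any two residues, H.
LOOSE = re.compile(r"HE..H")

# Stringent motif 'abXHEbbHbc' with 'a' and 'X' left as any residue.
STRICT = re.compile(
    rf".[{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]"
)

def classify(sequence):
    """Return (loose match, strict match); each is the matched text or None."""
    seq = sequence.upper()
    loose = LOOSE.search(seq)
    strict = STRICT.search(seq)
    return (loose.group() if loose else None,
            strict.group() if strict else None)

# Toy fragments invented for illustration.
both = classify("GGVVAHELTHAVTD")   # satisfies both definitions
loose_only = classify("KKHEDDHKK")  # HEXXH present, but charged neighbours
```

    The second fragment shows why the stringent form is useful: HEXXH alone occurs in sequence contexts that the stricter definition rejects.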

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to the MEROPS peptidase family M18 (clan MH). The proteins have two catalytic zinc ions at the active site, bound by His/Asp, Asp, Glu, Asp/Glu and His. The catalysed reaction involves the release of an N-terminal amino acid, usually neutral or hydrophobic, from a polypeptide.

    The type example is aminopeptidase I from Saccharomyces cerevisiae (Baker's yeast), the sequence of which has been deduced, and the mature protein shown to consist of 469 amino acids. A 45-residue presequence contains both positively- and negatively-charged and hydrophobic residues, which could be arranged in an N-terminal amphiphilic alpha-helix. The presequence differs from signal sequences that direct proteins across bacterial plasma membranes and endoplasmic reticulum or into mitochondria. It is unclear how this unique presequence targets aminopeptidase I to yeast vacuoles, and how this sorting utilises classical protein secretory pathways.

    Proteins where this domain is known:
    PFI1570c   


    PR00945 - HGRDTASE (Prints link)

    Interpro entry IPR000815 : Mercuric reductase (Interpro link)

    Interpro description:
    Proteins that transport heavy metals in micro-organisms and mammals share similarities in their sequences and structures.

    A conserved 30-residue domain has been found in a number of these heavy metal transport or detoxification proteins. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding. The HMA domain has been identified in the N-terminal regions of a variety of cation-transporting ATPases (E1-E2 ATPases). In addition, the domain has been found in bacterial mercuric reductase; the copP copper-binding protein of Helicobacter pylori; and in the N-terminal regions of mercuric transport protein periplasmic component (gene merP) and plasmids carried by mercury-resistant Gram-negative bacteria, where it seems to be a mercury scavenger that specifically binds to one Hg(2+) ion, passing this to mercuric reductase via the merT protein.

    The structure of the mercuric ion-binding protein MerP from Shigella flexneri has been determined. The fold has been classed as a ferredoxin-like alpha-beta sandwich, having a beta-alpha beta-beta alpha-beta architecture, with the two alpha-helices overlaying a four-stranded anti-parallel beta-sheet. Structural differences between the reduced and mercury-bound forms of merP are localised to the metal-binding loop containing the consensus sequence GMTCXXC, the two cysteines of which are involved in bi-coordination of Hg(2+).

    Mercuric reductase, which contains a single copy of the HMA domain, is involved in a specialised system that confers resistance to Hg(2+) by catalysing the reaction:

     Hg + NADP+ + H+ = Hg2+ + NADPH 
    The protein functions as a homodimer, with an FAD flavoprotein; its active site is a redox-active disulphide bond.

    Proteins where this domain is known:
    PF14_0192   


    PR00961 - HUDSXLRNA (Prints link)

    Interpro entry IPR002343 : Paraneoplastic encephalomyelitis antigen (Interpro link)

    Interpro description:
    Many eukaryotic proteins that are either known or thought to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This region has been found in, for example, heterogeneous nuclear ribonucleoproteins, small nuclear ribonucleoproteins, pre-RNA and mRNA associated proteins, Drosophila sex determination and elav proteins, human paraneoplastic encephalomyelitis antigen HuD, and many others. The structure of an RNA-binding domain of Drosophila Sex-lethal (Sxl) protein has been determined using multi-dimensional hetero-nuclear NMR. Sxl contains two RNP consensus-type RNA-binding domains (RBDs) - the determined structure represents the second of these (RBD-2). The calculated intermediate-resolution family of structures exhibits the beta-alpha-beta/beta-alpha-beta tertiary fold found in other RBD-containing proteins.

    Proteins where this domain is known:
    PF13_0315    PF14_0096   


    PR00972 - RIBSOMALS12E (Prints link)

    Interpro entry IPR000530 : Ribosomal protein S12e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The small ribosomal subunit protein S12 contains 130-150 amino acid residues, and is thought to be involved in the translation initiation step. This family consists of eukaryotic S12 ribosomal proteins, including those from vertebrates, Trypanosoma brucei, Caenorhabditis elegans, Drosophila and Saccharomyces cerevisiae (Baker's yeast).

    Proteins where this domain is known:
    PFC0295c   


    PR00973 - RIBOSOMALS17 (Prints link)

    Interpro entry IPR000266 : Ribosomal protein S17 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence has revealed that S17 has virtually no groups exposed on the ribosomal surface.

    Proteins where this domain is known:
    PFC0775w   


    PR00975 - RIBOSOMALS19 (Prints link)

    Interpro entry IPR002222 : Ribosomal protein S19/S15 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).

    Proteins where this domain is known:
    MAL13P1.92   


    PR00980 - TRNASYNTHALA (Prints link)

    Interpro entry IPR018164 : Alanyl-tRNA synthetase, class IIc, N-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
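
    The class assignment listed above can be captured as a small lookup table. This sketch encodes only the two lists given in the paragraph; the names `CLASS_I`, `CLASS_II` and `synthetase_class` are hypothetical, and lineage-specific exceptions are not modelled.

```python
# Amino acid specificities as listed in the paragraph above.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})

def synthetase_class(amino_acid):
    """Return "I" or "II" for the synthetase specific to the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise KeyError(f"not one of the 20 listed amino acids: {amino_acid!r}")

# Sanity checks: the two lists are disjoint and together cover 20 amino acids.
assert CLASS_I.isdisjoint(CLASS_II)
assert len(CLASS_I | CLASS_II) == 20
```

    For example, `synthetase_class("Alanine")` gives "II" and `synthetase_class("valine")` gives "I", in line with the list above.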

    Alanyl-tRNA synthetase is an alpha4 tetramer that belongs to class IIc.

    Proteins where this domain is known:
    PF13_0354   


    PR00981 - TRNASYNTHSER (Prints link)

    Interpro entry IPR018156 : Seryl-tRNA synthetase, class IIa, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.

    Proteins where this domain is known:
    PF07_0073    PFL0770w   


    PR00982 - TRNASYNTHLYS (Prints link)

    Interpro entry IPR018149 : Lysyl-tRNA synthetase, class II, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Lysyl-tRNA synthetase is an alpha2 homodimer that belongs to both class I and class II. In eubacteria and eukaryota lysyl-tRNA synthetases belong to class II in the same family as aspartyl tRNA synthetase. The class Ic lysyl-tRNA synthetase family is present in archaea and some eubacteria. Moreover in some eubacteria there is a gene X, which is similar to a part of lysyl-tRNA synthetase from class II. Lysyl-tRNA synthetase is duplicated in some species, for example in Escherichia coli, where there is a constitutive gene (lysS) and an induced one (lysU). Lysine is activated by being attached to the alpha-phosphate of AMP before being transferred to the cognate tRNA. The refined crystal structures give "snapshots" of the active site corresponding to key steps in the aminoacylation reaction and provide the structural framework for understanding the mechanism of lysine activation. The active site of LysU is shaped to position the substrates for the nucleophilic attack of the lysine carboxylate on the ATP alpha-phosphate. No residues are directly involved in catalysis, but a number of highly conserved amino acids and three metal ions coordinate the substrates and stabilise the pentavalent transition state. A loop close to the catalytic pocket, disordered in the lysine-bound structure, becomes ordered upon adenine binding.

    Proteins where this domain is known:
    PF13_0262    PF14_0166   


    PR00983 - TRNASYNTHCYS (Prints link)

    Interpro entry IPR015803 : Cysteinyl-tRNA synthetase, class Ia, N-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Cysteinyl-tRNA synthetase is an alpha monomer and belongs to class Ia.

    Proteins where this domain is known:
    PF10_0149   


    PR00984 - TRNASYNTHILE (Prints link)

    Interpro entry IPR015905 : (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Isoleucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. The enzyme, isoleucyl-transfer RNA synthetase, activates not only the cognate substrate L-isoleucine but also the minimally distinct L-valine in the first, aminoacylation step. Then, in a second, "editing" step, the synthetase itself rapidly hydrolyzes only the valylated products as shown from the crystal structures.

    Proteins where this domain is known:
    PF13_0179   


    PR00985 - TRNASYNTHLEU (Prints link)

    Interpro entry IPR002302 : Leucyl-tRNA synthetase, class Ia, bacterial/mitochondrial (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Leucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. There are two different families of leucyl-tRNA synthetases. This family includes the eubacterial and mitochondrial synthetases. The crystal structure of leucyl-tRNA synthetase from the hyperthermophile Thermus thermophilus has an overall architecture that is similar to that of isoleucyl-tRNA synthetase, except that the putative editing domain is inserted at a different position in the primary structure. This feature is unique to prokaryote-like leucyl-tRNA synthetases, as is the presence of a novel additional flexibly inserted domain.

    Proteins where this domain is known:
    PF08_0011   


    PR00986 - TRNASYNTHVAL (Prints link)

    Interpro entry IPR002303 : Valyl-tRNA synthetase, class Ia (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Valyl-tRNA synthetase is an alpha monomer that belongs to class Ia.
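
    The "two-step reaction" mentioned in the description is not spelled out in this listing. As a hedged sketch based on standard biochemistry rather than on the InterPro text, the two steps can be written as amino acid activation followed by transfer to the tRNA (PP_i denotes pyrophosphate):

```latex
\begin{align*}
\text{amino acid} + \text{ATP} &\longrightarrow \text{aminoacyl-AMP} + \text{PP}_i \\
\text{aminoacyl-AMP} + \text{tRNA} &\longrightarrow \text{aminoacyl-tRNA} + \text{AMP}
\end{align*}
```

    In the second step the aminoacyl group is attached to the 2'-hydroxyl (class I) or, preferentially, the 3'-hydroxyl (class II) of the tRNA, as stated above.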

    Proteins where this domain is known:
    PF14_0589    PFC0470w   


    PR00987 - TRNASYNTHGLU (Prints link)

    Interpro entry IPR000924 : Glutamyl/glutaminyl-tRNA synthetase, class Ic (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Glutamyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossman fold' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.

    Proteins where this domain is known:
    MAL13P1.281    PF13_0170    PF13_0257   
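
    The class I / class II split described above can be captured as a small lookup table. The following is a minimal sketch that only encodes the amino acid lists and the preferred acylation site given in the description; the names CLASS_I, CLASS_II, synthetase_class and acylation_site are hypothetical and chosen for illustration.

```python
# Class membership exactly as listed in the InterPro description above.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})

# tRNA ribose hydroxyl that receives the aminoacyl group
# (for class II the description says this site is "preferred").
ACYLATION_SITE = {"I": "2'-hydroxyl", "II": "3'-hydroxyl"}


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase specific to the amino acid."""
    key = amino_acid.strip().lower()
    if key in CLASS_I:
        return "I"
    if key in CLASS_II:
        return "II"
    raise KeyError(f"not one of the 20 listed amino acids: {amino_acid!r}")


def acylation_site(amino_acid: str) -> str:
    """Return the tRNA hydroxyl used by the synthetase for this amino acid."""
    return ACYLATION_SITE[synthetase_class(amino_acid)]
```

    For example, synthetase_class("glutamine") gives "I" and acylation_site("proline") gives "3'-hydroxyl". The subclasses (a, b, c) are stated per entry in this listing and are deliberately left out of the sketch.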


    PR01001 - FADG3PDH (Prints link)

    Interpro entry IPR000447 : FAD-dependent glycerol-3-phosphate dehydrogenase (Interpro link)

    Interpro description:
    FAD-dependent glycerol-3-phosphate dehydrogenase (G3PDH) catalyzes the conversion of glycerol-3-phosphate into dihydroxyacetone phosphate:
      sn-glycerol-3-phosphate + acceptor = glycerone phosphate + reduced acceptor 
    Insulin exposure often stimulates G3PDH activity, which makes the enzyme relevant to reducing the effects of diabetes. In obese people, where insulin resistance has been demonstrated, the amount of G3PDH has been shown to be correspondingly lower than in people of normal weight. In bacteria, the enzyme is associated with the utilization of glycerol coupled to respiration. In Escherichia coli and Haemophilus influenzae, two isozymes are known: one expressed under anaerobic conditions (gene glpA) and one under aerobic conditions (gene glpD). In eukaryotes, a mitochondrial form of the enzyme participates in the glycerol phosphate shuttle in conjunction with an NAD-dependent cytoplasmic glycerol-3-phosphate dehydrogenase. This mechanism is responsible for preserving the redox balance. In this environment, the enzyme has been reported to show increased activity in the presence of calcium. These enzymes are proteins of about 60 to 70 kDa which contain a probable FAD-binding domain at their N-terminal extremity. The mammalian enzyme differs from the bacterial or yeast proteins by having an EF-hand calcium-binding region at its C-terminal extremity.

    Proteins where this domain is known:
    PFC0275w   


    PR01034 - RIBOSOMALS12 (Prints link)

    Interpro entry IPR005679 : Ribosomal protein S12, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA, and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This family consists of ribosomal protein S12 from bacteria, mitochondria, and chloroplasts.

    Proteins where this domain is known:
    PFD0600c   


    PR01035 - TCRTETA (Prints link)

    Interpro entry IPR001958 : Tetracycline resistance protein, TetA (Interpro link)

    Interpro description:

    The antibiotic tetracycline has a broad spectrum of activity, acting to inhibit bacterial protein synthesis by binding to the 30S ribosomal subunit, which prevents the association of the aminoacyl-tRNA with the ribosomal acceptor A site. Tetracycline binding is reversible, therefore diluting out the antibiotic can reverse its effects. Tetracycline resistance genes are often located on mobile elements, such as plasmids, transposons and/or conjugative transposons, which can sometimes be transferred between bacterial species. In certain cases, tetracycline can enhance the transfer of these elements, thereby promoting resistance amongst a bacterial colony. There are three types of tetracycline resistance: tetracycline efflux, ribosomal protection, and tetracycline modification.

    The expression of several of these tet genes is controlled by a family of tetracycline transcriptional regulators known as TetR. TetR family regulators are involved in the transcriptional control of multidrug efflux pumps, pathways for the biosynthesis of antibiotics, response to osmotic stress and toxic chemicals, control of catabolic pathways, differentiation processes, and pathogenicity. The TetR proteins identified in over 115 genera of bacteria and archaea share a common helix-turn-helix (HTH) structure in their DNA-binding domain. However, TetR proteins can work in different ways: they can bind a target operator directly to exert their effect (e.g. TetR binds Tet(A) gene to repress it in the absence of tetracycline), or they can be involved in complex regulatory cascades in which the TetR protein can either be modulated by another regulator or TetR can trigger the cellular response.

    This entry represents the tetracycline resistance protein Tet(A), a tetracycline efflux protein that functions as a metal-tetracycline/H+ antiporter. This is an energy-dependent process that decreases the accumulation of the antibiotic in whole cells. Tet(A) is encoded by the transposon Tn10, and is an integral membrane protein with twelve potential transmembrane domains. Site-directed mutagenesis studies have shown that a negative charge at position 66 is essential for tetracycline transport, and that the region that includes the dipeptide plays an important role in metal-tetracycline transport; it perhaps acts as a gate that opens on the charge-charge interaction between Asp66 and the metal-tetracycline.

    Proteins where this domain is known:
    PFE0825w   


    PR01038 - TRNASYNTHARG (Prints link)

    Interpro entry IPR015945 : Arginyl-tRNA synthetase, class Ic, core (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    This entry represents the core region of arginyl-tRNA synthetase. The enzyme has been crystallized, and a preliminary X-ray crystallographic analysis of yeast arginyl-tRNA synthetase in complex with yeast tRNA(Arg) is available.

    Proteins where this domain is known:
    PFI0680c    PFL0900c   


    PR01039 - TRNASYNTHTRP (Prints link)

    Interpro entry IPR002306 : Tryptophanyl-tRNA synthetase, class Ib (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Tryptophanyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. The crystal structure of tryptophanyl-tRNA synthetase is known.

    Proteins where this domain is known:
    PF13_0205   


    PR01040 - TRNASYNTHTYR (Prints link)

    Interpro entry IPR002307 : Tyrosyl-tRNA synthetase, class Ib, bacterial/mitochondrial (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Tyrosyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. Studies on tyrosyl-tRNA synthetase provide the first kinetic evidence that the 'KMSKS' motif plays a role in the initial binding of tRNA(Tyr) to tyrosyl-tRNA synthetase.

    Proteins where this domain is known:
    PF11_0181   


    PR01041 - TRNASYNTHMET (Prints link)

    Interpro entry IPR014758 : Methionyl-tRNA synthetase, class Ia, N-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Methionyl-tRNA synthetase is an alpha2 dimer that belongs to class Ia. In some species (archaea, eubacteria and eukaryotes), a coding sequence similar to the C-terminal end of MetRS is present as an independent gene, whose product functions as a tRNA-binding domain in dimeric form. In eubacteria, MetRS can also be split into two subclasses corresponding to the presence of one or two CXXC domains specific to zinc binding. The crystal structures of a number of methionyl-tRNA synthetases are known.

    Proteins where this domain is known:
    PF10_0053    PF10_0340   


    PR01042 - TRNASYNTHASP (Prints link)

    Interpro entry IPR002312 : Aspartyl-tRNA synthetase, class IIb (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Aspartyl-tRNA synthetase is an alpha2 dimer that belongs to class IIb. Structural analysis, combined with mutagenesis and enzymology data on the yeast enzyme, points to a tRNA binding process that starts with a recognition event between the tRNA anticodon loop and the synthetase anticodon binding module.

    Proteins where this domain is known:
    PFA0145c    PFB0525w    PFE0475w    PFE0715w   


    PR01046 - TRNASYNTHPRO (Prints link)

    Interpro entry IPR002316 : Prolyl-tRNA synthetase, class IIa, conserved region (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Prolyl-tRNA synthetase exists in two forms, which are loosely related. The first form is present in the majority of eubacterial species. The second form, present in some eubacteria, is found mainly in archaea and eukaryota. Prolyl-tRNA synthetase belongs to class IIa. The enzyme from Escherichia coli contains all three of the conserved consensus motifs characteristic of class II aminoacyl-tRNA synthetases. The complex between Thermus thermophilus prolyl-tRNA synthetase (ProRSTT) and its cognate tRNA has been crystallized using two different isoacceptors of tRNA(Pro).

    Proteins where this domain is known:
    PFL0670c   


    PR01047 - TRNASYNTHTHR (Prints link)

    Interpro entry IPR018158 : Threonyl-tRNA synthetase, class IIa, conserved region (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossman fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Threonyl-tRNA synthetase exists as a monomer and belongs to class IIa. The enzyme from Escherichia coli represses the translation of its own mRNA. The crystal structure of the complex between tRNA(Thr) and ThrRS shows structural features that reveal novel strategies for providing specificity in tRNA selection. These include an amino-terminal domain containing a novel protein fold that makes minor groove contacts with the tRNA acceptor stem. The enzyme induces a large deformation of the anticodon loop, resulting in an interaction between two adjacent anticodon bases, which accounts for their prominent role in tRNA identity and translational regulation. A zinc ion found in the active site is implicated in amino acid recognition/discrimination. The zinc ion may act to ensure that only amino acids that possess a hydroxyl group attached to the beta-position are activated.

    Proteins where this domain is known:
    PF11_0270   


    PR01050 - PYRUVTKNASE (Prints link)

    Interpro entry IPR015793 : Pyruvate kinase, barrel (Interpro link)

    Interpro description:

    Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:

     ADP + phosphoenolpyruvate = ATP + pyruvate 

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor), therefore when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
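
    The isozyme-specific regulation described above can be summarised as a lookup table. This is a minimal sketch that records only the effects stated in the paragraph; the names EFFECTS and effect are hypothetical, and any effector not mentioned for an isozyme is reported as "not stated" rather than guessed.

```python
# Effector -> effect on pyruvate kinase activity, per isozyme,
# restricted to what the description above states.
EFFECTS = {
    "L": {  # liver isozyme
        "fructose-1,6-bisphosphate": "activates",
        "ATP": "inhibits",
        "alanine": "inhibits",
        "insulin": "activates",
        "glucagon": "inhibits",
    },
    "M1": {  # muscle, heart and brain isozyme
        "ATP": "inhibits",
        "fructose-1,6-bisphosphate": "no effect",
        "alanine": "no effect",
    },
}


def effect(isozyme: str, effector: str) -> str:
    """Look up the stated effect of an effector on a PK isozyme."""
    return EFFECTS[isozyme].get(effector, "not stated")
```

    For instance, effect("L", "alanine") returns "inhibits", whereas effect("M1", "alanine") returns "no effect", which mirrors the contrast between liver and muscle drawn in the text.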

    The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    This entry represents the two barrel domains, the beta/alpha-barrel, and the beta-barrel inserted within it.

    Proteins where this domain is known:
    PFF1300w   


    PR01099 - HYETHTZKNASE (Prints link)

    Interpro entry IPR000417 : Hydroxyethylthiazole kinase (Interpro link)

    Interpro description:
    Thiamine pyrophosphate (TPP), a required cofactor for many enzymes in the cell, is synthesised de novo in Salmonella typhimurium. Five kinase activities have been implicated in TPP synthesis, which involves joining a 4-methyl-5-(beta-hydroxyethyl)thiazole (THZ) moiety and a 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) moiety. THZ kinase activity is involved in the salvage synthesis of TH-P from the thiazole:
     2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate 
    Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate.

    Proteins where this domain is known:
    PFL1920c   


    PR01158 - TOPISMRASEII (Prints link)

    Interpro entry IPR001154 : DNA topoisomerase II, eukaryotic-type (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.
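
    The "number of topological links" can be made concrete with the linking number. The relations below are a hedged sketch in standard DNA-topology notation that does not appear in this listing: Lk is the linking number, Tw the twist, Wr the writhe, Lk_0 the linking number of the relaxed molecule, and sigma the superhelical density.

```latex
\begin{align*}
Lk &= Tw + Wr, \\
\Delta Lk &= \pm 2 \quad \text{per strand-passage event of a type II enzyme}, \\
\sigma &= \frac{Lk - Lk_0}{Lk_0}.
\end{align*}
```

    As an illustration with assumed values, a closed circular DNA with Lk_0 = 500 and sigma = -0.06 has Lk = 470, a deficit of 30 links, so fully relaxing it would take 15 strand-passage events that each raise Lk by 2.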

    This entry represents DNA topoisomerase II enzymes from eukaryotes and viruses. Topoisomerase II primarily functions in introducing negative supercoils into DNA, and is of particular importance during the segregation of chromosomes during mitosis. In eukaryotes and viruses, this enzyme occurs as a single polypeptide, with the N-terminal portion (homologous to subunit B of bacterial topoisomerase II, or gyraseB) responsible for ATPase activity and the C-terminal portion (homologous to subunit A of bacterial topoisomerase II, or gyraseA) responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges. In mammals, there are at least two isozymes of this enzyme, topoisomerases II-alpha and II-beta, which are similar in structure and catalytic properties. The alpha isoform is involved in chromosome condensation and segregation.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316   


    PR01159 - DNAGYRASEB (Prints link)

    Interpro entry IPR000565 : DNA topoisomerase, type IIA, subunit B (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.
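
    The correspondence between the bacterial subunits, the eukaryotic single polypeptide and topoisomerase IV described above can be laid out as a small data structure. This is a minimal sketch that restates only what the two preceding paragraphs say; the names Domain, DOMAINS, EUKARYOTIC_REGION, TOPO_IV_HOMOLOG and describe are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    number: int             # domain number used in the description
    bacterial_subunit: str  # gyrase subunit that carries the domain
    position: str           # location within that subunit
    function: str


# The four functional domains of topoisomerase II, as described above.
DOMAINS = (
    Domain(1, "gyrB", "N-terminal", "ATPase"),
    Domain(2, "gyrB", "C-terminal", "subunit interactions"),
    Domain(3, "gyrA", "N-terminal", "DNA breaking-rejoining"),
    Domain(4, "gyrA", "C-terminal", "non-specific DNA binding"),
)

# Region of the single eukaryotic polypeptide matching each bacterial subunit.
EUKARYOTIC_REGION = {"gyrB": "N-terminal region", "gyrA": "C-terminal region"}

# Topoisomerase IV subunit homologous to each gyrase subunit.
TOPO_IV_HOMOLOG = {"gyrA": "parC", "gyrB": "parE"}


def describe(number: int) -> dict:
    """Collect what the description states about one functional domain."""
    domain = next(d for d in DOMAINS if d.number == number)
    return {
        "function": domain.function,
        "bacterial_subunit": domain.bacterial_subunit,
        "eukaryotic_region": EUKARYOTIC_REGION[domain.bacterial_subunit],
        "topo_iv_homolog": TOPO_IV_HOMOLOG[domain.bacterial_subunit],
    }
```

    For example, describe(1) reports the ATPase domain as carried by gyrB, located in the N-terminal region of the eukaryotic enzyme, with parE as the topoisomerase IV counterpart.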

    This entry represents subunit B found in topoisomerase II (gyrB) and topoisomerase IV (parE), primarily of bacterial origin, and which functions in ATP hydrolysis and subunit interaction. It does not include the topoisomerase II enzymes composed of a single polypeptide, as are found in most eukaryotes.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PFL1915w   


    PR01161 - TUBULIN (Prints link)

    Interpro entry IPR000217 : Tubulin (Interpro link)

    Interpro description:

    Microtubules are polymers of tubulin, a dimer of two 55-kDa subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end.

    For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.

    Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly.

    Proteins where this domain is known:
    PF08_0125    PF14_0725   


    PR01164 - GAMMATUBULIN (Prints link)

    Interpro entry IPR002454 : Gamma tubulin (Interpro link)

    Interpro description:
    Microtubules are polymers of tubulin, a dimer of two 55 kD subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.

    Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma-tubulins constitute a ubiquitous and highly conserved subfamily of the tubulin family. The protein is found at microtubule-organising centres, such as the spindle poles or the centrosome. It remains associated with the centrosome when microtubules are depolymerised, suggesting that it is an integral component that might play a role in minus-end nucleation of microtubule assembly.

    Proteins where this domain is known:
    PF08_0125   


    PR01166 - CYCOXIDASEII (Prints link)

    Interpro entry IPR002429 : Cytochrome c oxidase subunit II C-terminal (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).

    Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.

    Proteins where this domain is known:
    PF13_0327    PF14_0288   


    PR01179 - ODADCRBXLASE (Prints link)

    Interpro entry IPR000183 : Orn/DAP/Arg decarboxylase 2 (Interpro link)

    Interpro description:
    These enzymes are collectively known as group IV decarboxylases. Pyridoxal-dependent decarboxylases acting on ornithine, lysine, arginine and related substrates can be classified into two different families on the basis of sequence similarities. Members of this family, while most probably evolutionarily related, do not share extensive regions of sequence similarity. The proteins contain a conserved lysine residue which is known, in mouse ODC, to be the site of attachment of the pyridoxal-phosphate group. The proteins also contain a stretch of three consecutive glycine residues that has been proposed to be part of a substrate-binding region.

    Proteins where this domain is known:
    PF10_0322   


    PR01182 - ORNDCRBXLASE (Prints link)

    Interpro entry IPR002433 : Ornithine decarboxylase (Interpro link)

    Interpro description:
    These enzymes are collectively known as group IV decarboxylases. Pyridoxal-dependent decarboxylases acting on ornithine, lysine, arginine and related substrates can be classified into two different families on the basis of sequence similarities. Members of this family, while most probably evolutionarily related, do not share extensive regions of sequence similarity. The proteins contain a conserved lysine residue which is known, in mouse ODC, to be the site of attachment of the pyridoxal-phosphate group. The proteins also contain a stretch of three consecutive glycine residues that has been proposed to be part of a substrate-binding region.

    The ornithine decarboxylases catalyse the transformation of ornithine into putrescine. Phylogenetic analysis of the mRNAs from several mammalian species suggests that ODC is encoded by orthologous genes in the different species. Analysis of divergence patterns in a number of subregions showed that the domains have evolved in a noncoordinate fashion. Evolution of each subregion has been episodic, with periods of both rapid and slow divergence, possibly indicating the existence of selection pressures that were exerted in a time- and domain-specific manner during mammalian speciation. The active form of mammalian ODC is a homodimer of 53 kDa subunits (the monomer retains no enzymatic activity). In vitro hybridisation and cross-linkage analysis have suggested that the active site of ODC is formed at the interface of the two monomers via the interaction of the cysteine-360-containing region of one subunit with the lysine-69-containing region of the other.

    Proteins where this domain is known:
    PF10_0322   


    PR01183 - RIBORDTASEM1 (Prints link)

    Interpro entry IPR000788 : Ribonucleotide reductase large subunit, C-terminal (Interpro link)

    Interpro description:

    Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.

    Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain.

    The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals; the function of each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction process by hydrogen atom abstraction from the ribonucleotide. The radical-based reaction involves five cysteines: two of these are located at adjacent anti-parallel strands in a new type of ten-stranded alpha/beta-barrel; two others reside at the carboxyl end in a flexible arm; and the fifth, in a loop in the centre of the barrel, is positioned to initiate the radical reaction. There are several regions of similarity in the sequence of the large chain of prokaryotes, eukaryotes and viruses spread across 3 domains: an N-terminal domain common to the mammalian and bacterial enzymes; a C-terminal domain common to the mammalian and viral ribonucleotide reductases; and a central domain common to all three.

    Proteins where this domain is known:
    PF14_0352   


    PR01224 - DELTATUBULIN (Prints link)

    Interpro entry IPR002967 : Delta tubulin (Interpro link)

    Interpro description:
    Microtubules are polymers of tubulin, a dimer of two 55 kD subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site. Most species, excepting simple eukaryotes, express a variety of closely-related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly. More recently, a new delta-type tubulin has been identified in Chlamydomonas reinhardtii and Mus musculus (Mouse), and is likely to be found in a number of other species.

    Proteins where this domain is known:
    PFI1635w   


    PR01233 - JOSEPHIN (Prints link)

    Interpro entry IPR006155 : (Interpro link)

    Interpro description:
    Human genes containing triplet repeats can markedly expand in length, leading to neuropsychiatric disease. Expansion of triplet repeats explains the phenomenon of anticipation, i.e. the increasing severity or earlier age of onset in successive generations in a pedigree. A novel gene containing CAG repeats has been identified and mapped to chromosome 14q32.1, the genetic locus for Machado-Joseph disease (MJD). Normally, the gene contains 13-36 CAG repeats, but most clinically diagnosed patients and all affected members of a family with the clinical and pathological diagnosis of MJD show expansion of the repeat number to 68-79. Similar abnormalities in related genes may give rise to diseases similar to MJD. MJD is a neurodegenerative disorder characterised by cerebellar ataxia, pyramidal and extra-pyramidal signs, peripheral nerve palsy, external ophthalmoplegia, facial and lingual fasciculation and bulging. The disease is autosomal dominant, with late onset of symptoms, generally after the fourth decade.
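
    The repeat ranges quoted above can be made concrete with a minimal sketch that measures the longest uninterrupted CAG run in a DNA string and compares it with the two ranges given in this description (13-36 and 68-79). The function names and the toy sequence are hypothetical, and the thresholds are taken from this text only; this is an illustration, not a diagnostic criterion.

```python
import re

# Ranges copied from the description above; not clinical cut-offs.
NORMAL_RANGE = (13, 36)
EXPANDED_RANGE = (68, 79)


def longest_cag_run(dna: str) -> int:
    """Return the number of units in the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(count: int) -> str:
    """Place a repeat count in one of the ranges quoted in the text."""
    if NORMAL_RANGE[0] <= count <= NORMAL_RANGE[1]:
        return "normal range"
    if EXPANDED_RANGE[0] <= count <= EXPANDED_RANGE[1]:
        return "expanded range"
    return "outside quoted ranges"


if __name__ == "__main__":
    toy_sequence = "ATG" + "CAG" * 20 + "GGC"  # made-up example sequence
    units = longest_cag_run(toy_sequence)
    print(units, classify_repeat_count(units))
```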

    Proteins where this domain is known:
    PFL1295w   


    PR01243 - NUCDPKINASE (Prints link)

    Interpro entry IPR001564 : Nucleoside diphosphate kinase, core (Interpro link)

    Interpro description:

    Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.

    In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
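
    The random association series (A6, A5B...AB5, B6) can be enumerated with a short sketch. The second function goes one step beyond the text: it assumes each position in the hexamer is filled independently from a pool with a fixed share of A chains, which gives binomial fractions. That independence model and the parameter name frac_a are illustrative assumptions, not something stated in the description.

```python
from math import comb


def hexamer_compositions(subunits: int = 6) -> list:
    """Return labels for every A/B mix of a hexamer, from A6 to B6."""
    def part(letter: str, n: int) -> str:
        if n == 0:
            return ""
        return letter if n == 1 else f"{letter}{n}"

    return [part("A", subunits - b) + part("B", b) for b in range(subunits + 1)]


def random_association_fractions(frac_a: float = 0.5, subunits: int = 6) -> list:
    """Binomial fraction of each composition under independent incorporation.

    frac_a is the assumed share of A chains in the pool (hypothetical value).
    """
    frac_b = 1.0 - frac_a
    return [
        comb(subunits, b) * frac_a ** (subunits - b) * frac_b ** b
        for b in range(subunits + 1)
    ]


if __name__ == "__main__":
    for label, fraction in zip(hexamer_compositions(), random_association_fractions()):
        print(f"{label:5s} {fraction:.4f}")
```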

    NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.

    NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.

    Proteins where this domain is known:
    PF13_0349    PFF0275c   


    PR01250 - RIBOSOMALL34 (Prints link)

    Interpro entry IPR008195 : Ribosomal protein L34e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31, plant L34, yeast putative ribosomal protein YIL052c and archaebacterial L34e.

    Proteins where this domain is known:
    PF07_0043   


    PR01270 - HDASUPER (Prints link)

    Interpro entry IPR000286 : (Interpro link)

    Interpro description:
    Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal of the acetyl group. Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are all members of this ancient protein superfamily.

    Proteins where this domain is known:
    PF10_0078   


    PR01405 - TETRPHPHTASE (Prints link)

    Interpro entry IPR003565 : Bis(5'-nucleosyl)-tetraphosphatase (Interpro link)

    Interpro description:
    MutT is a small bacterial protein (~12-15 kDa) involved in the GO system responsible for removing an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and the nucleotide pool. 8-oxo-dGTP is inserted opposite dA and dC residues of template DNA with near equal efficiency, leading to A.T to G.C transversions. MutT specifically degrades 8-oxo-dGTP to the monophosphate, with the concomitant release of pyrophosphate. A short conserved N-terminal region of mutT (designated the MutT domain) is also found in a variety of other prokaryotic, viral, and eukaryotic proteins. Recently, the generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family.

    The enzyme diadenosine 5',5'''-P1,P4-tetraphosphate pyrophosphohydrolase asymmetrically hydrolyses AP4A to yield AMP and ATP. The catalysed reaction is as follows:

            P(1),P(4)-bis(5'-adenosyl)tetraphosphate + H(2)O = ATP + AMP.

    The cDNA and derived amino acid sequence of human diadenosine 5',5'''-P1,P4-tetraphosphate pyrophosphohydrolase have been determined by means of EST analysis. The protein possesses a modification of the MutT domain found in certain nucleotide pyrophosphatases.

    Proteins where this domain is known:
    PFE1035c   


    PR01415 - ANKYRIN (Prints link)

    Proteins where this domain is known:
    PF10_0102   


    PR01519 - EPSLNTUBULIN (Prints link)

    Interpro entry IPR004057 : Epsilon tubulin (Interpro link)

    Interpro description:

    Microtubules are polymers of tubulin, a dimer of two 55-kDa subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end.

    For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.

    Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly. More recently, epsilon-tubulin has been identified in humans and Trypanosomes; in humans, it has been localised to centrosomes.

    Proteins where this domain is known:
    PF14_0725   


    PR01546 - YEAST73DUF (Prints link)

    Interpro entry IPR004353 : (Interpro link)

    Interpro description:

    Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologues to domains of known structure that assemble into cargo vesicle adapter (AP) complexes.

    Proteins where this domain is known:
    PF13_0274   


    PR01550 - TOP6AFAMILY (Prints link)

    Interpro entry IPR002815 : Spo11/DNA topoisomerase VI, subunit A (Interpro link)

    Interpro description:

    This entry represents Spo11, a meiotic recombination protein found in eukaryotes, and subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea. These two types of proteins share structural homology.

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.

    Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PFL0825c   


    PR01576 - PDEFORMYLASE (Prints link)

    Interpro entry IPR000181 : Formylmethionine deformylase (Interpro link)

    Interpro description:

    Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:

            N-formyl-L-methionine + H2O = formate + methionyl peptide

    Catalytic efficiency strongly depends on the identity of the bound metal.

    The structure of these enzymes is known. PDF, a member of the zinc metalloprotease family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
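
    As a rough illustration of how the HEXXH motif mentioned above might be located in a protein sequence, the sketch below scans a one-letter amino acid string with a regular expression in which X stands for any residue. The function name and the toy sequence are hypothetical; a match only flags a candidate site and says nothing about zinc binding or activity.

```python
import re

# H-E-any-any-H; the lookahead lets overlapping candidates be reported too.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(protein: str) -> list:
    """Return (1-based position, matched residues) for each HEXXH candidate."""
    sequence = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence)]


if __name__ == "__main__":
    toy_protein = "MKTHEAGHLL"  # made-up sequence for illustration
    print(find_hexxh(toy_protein))
```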

    Proteins where this domain is known:
    PFI0380c   


    PR01653 - TCTPROTEIN (Prints link)

    Interpro entry IPR018105 : (Interpro link)

    Interpro description:

    Mammalian translationally controlled tumour protein (TCTP) (or P23) is a protein which has been found to be preferentially synthesised in cells during the early growth phase of some types of tumour, but which is also expressed in normal cells. The physiological function of TCTP is still not known. It was first identified as a histamine-releasing factor, acting in IgE-dependent allergic reactions. In addition, TCTP has been shown to bind to tubulin in the cytoskeleton, has a high affinity for calcium, is the binding target for the antimalarial compound artemisinin, and is induced in vitamin D-dependent apoptosis. TCTP production is thought to be controlled at the translational as well as the transcriptional level.

    TCTP is a hydrophilic protein of 18 to 20 kDa. TCTPs do not share significant sequence similarity with any other class of proteins. Recently, the structure of TCTP was determined and exhibited significant structural similarity to the human protein Mss4, which is a guanine nucleotide-free chaperone of the Rab protein. Close homologues have been found in plants, earthworm, Caenorhabditis elegans (F52H2.11), Hydra, Saccharomyces cerevisiae (YKL056c) and Schizosaccharomyces pombe (SpAC1F12.02c).

    Proteins where this domain is known:
    PFE0545c   


    PR01657 - MCMFAMILY (Prints link)

    Interpro entry IPR001208 : DNA-dependent ATPase MCM (Interpro link)

    Interpro description:

    MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.

    This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.

    The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.
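
    The diagnostic sequence IDEFDKM quoted above lends itself to a simple sliding-window check. The sketch below reports where the motif occurs in a protein string and, optionally, tolerates a chosen number of mismatches. The mismatch tolerance, the function name and the toy sequences are assumptions made for illustration; the description itself only gives the exact motif.

```python
MCM_MOTIF = "IDEFDKM"  # diagnostic sequence quoted in the description


def motif_hits(protein: str, motif: str = MCM_MOTIF, max_mismatches: int = 0) -> list:
    """Return (1-based position, window, mismatches) for each acceptable window."""
    sequence = protein.upper()
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        # Hamming distance between the window and the motif.
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start + 1, window, mismatches))
    return hits


if __name__ == "__main__":
    print(motif_hits("AAIDEFDKMGG"))                     # exact match
    print(motif_hits("AAIDEFDRMGG", max_mismatches=1))   # one substitution allowed
```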

    The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.

    Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.

    Proteins where this domain is known:
    PF07_0023    PF13_0095    PF13_0291    PF14_0177    PFD0790c    PFE1345c    PFL0560c   
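
    For readers who want to work with this listing programmatically, the following minimal sketch collects the protein identifiers found under each PRINTS signature. The two regular expressions are assumptions inferred only from identifiers visible in this listing (for example PF14_0725, PFL1915w, PFE1035c, PR01657); other identifier schemes are not covered, and the function name is hypothetical.

```python
import re

# Assumed identifier shapes, inferred from this listing only.
PROTEIN_ID = re.compile(r"\bPF(?:\d{2}_\d{4}|[A-Z]\d{4}[wc])\b")
SIGNATURE = re.compile(r"^\s*(PR\d{5}) - (\S+)")


def parse_listing(text: str) -> dict:
    """Map each PRINTS accession to the protein IDs listed beneath it."""
    domains = {}
    current = None
    for line in text.splitlines():
        header = SIGNATURE.match(line)
        if header:
            current = header.group(1)
            domains[current] = []
        elif current is not None:
            domains[current].extend(PROTEIN_ID.findall(line))
    return domains


if __name__ == "__main__":
    sample = """
    PR01161 - TUBULIN (Prints link)
    Proteins where this domain is known:
    PF08_0125    PF14_0725

    PR01164 - GAMMATUBULIN (Prints link)
    Proteins where this domain is known:
    PF08_0125
    """
    print(parse_listing(sample))
```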


    PR01658 - MCMPROTEIN2 (Prints link)

    Interpro entry IPR008045 : MCM protein 2 (Interpro link)

    Interpro description:

    The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins to form the pre-replicative complex, which is essential for initiating and elongating replication forks during S phase.

    The components of the MCM2-7 complex in Homo sapiens (Human) are MCM2, MCM3, MCM4, MCM5, MCM6 and MCM7.

    Studies in Xenopus eggs have shown the six MCM proteins to form hexamers, where each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of origin recognition complex (ORC) - a complex of six subunits - to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6 and binding of the MCM protein complex to the ARS to form the pre-replicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of pre-RC by CDC7 and CDC28 protein kinases.

    MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.

    The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.

    In addition to its role in initiation of DNA replication, MCM2 is able to inhibit the MCM4,6,7 helicase. Studies on murine MCM2 indicate that its C-terminus is required for interaction with MCM4, as well as for inhibition of the DNA helicase activity of the MCM4,6,7 complex. The N-terminal region, which contains an H3-binding domain and a region required for nuclear localisation, is required for the phosphorylation by CDC7 kinase.

    Proteins where this domain is known:
    PF14_0177   


    PR01659 - MCMPROTEIN3 (Prints link)

    Interpro entry IPR008046 : MCM protein 3 (Interpro link)

    Interpro description:

    The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins to form the pre-replicative complex, which is essential for initiating and elongating replication forks during S phase.

    The components of the MCM2-7 complex in Homo sapiens (Human) are MCM2, MCM3, MCM4, MCM5, MCM6 and MCM7.

    Studies in Xenopus eggs have shown the six MCM proteins to form hexamers, where each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of origin recognition complex (ORC) - a complex of six subunits - to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6 and binding of the MCM protein complex to the ARS to form the pre-replicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of pre-RC by CDC7 and CDC28 protein kinases.

    MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.

    The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.

    Members of the MCM3 class have been isolated from a number of organisms. Human MCM3 was first described as a protein associated with DNA polymerase alpha-primase, although subsequent analysis failed to show a direct interaction between them. The gene encoding human MCM3 has been localised to chromosome 6p21.1-p12. In Saccharomyces cerevisiae (Baker's yeast), MCM3 is a phospho-protein that exists in multiple isoforms; distinct isoforms can be detected at specific stages of the cell cycle. MCM3 has been implicated in limb development in Xenopus; identification of maternal and zygotic proteins suggests that specific forms may be used at different developmental stages. The MCM3 protein contains a nuclear localisation signal, which is necessary for its translocation into the nucleus.

    Proteins where this domain is known:
    PFE1345c   


    PR01660 - MCMPROTEIN4 (Prints link)

    Interpro entry IPR008047 : MCM protein 4 (Interpro link)

    Interpro description:

    The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins to form a pre-replicative complex essential for initiating and elongating replication forks during S phase.

    The components of the MCM2-7 complex in Homo sapiens (Human) are:

    Studies in Xenopus eggs have shown that the six MCM proteins form hexamers in which each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of the origin recognition complex (ORC), a complex of six subunits, to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6, and the binding of the MCM protein complex to the ARS to form the prereplicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of the pre-RC by the CDC7 and CDC28 protein kinases.

    MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.

    The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.

    MCM4 is thought to play a pivotal role in ensuring DNA replication occurs only once per cell cycle. Phosphorylation of MCM4 dramatically reduces its affinity for chromatin - it has been proposed that this cell cycle-dependent phosphorylation is the mechanism that inactivates the MCM complex from late S phase through mitosis, thus preventing illegitimate DNA replication during that period of the cell cycle.

    Proteins where this domain is known:
    PF13_0095   


    PR01662 - MCMPROTEIN6 (Prints link)

    Interpro entry IPR008049 : MCM protein 6 (Interpro link)

    Interpro description:

    The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins to form a pre-replicative complex essential for initiating and elongating replication forks during S phase.

    The components of the MCM2-7 complex in Homo sapiens (Human) are:

    Studies in Xenopus eggs have shown that the six MCM proteins form hexamers in which each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of the origin recognition complex (ORC), a complex of six subunits, to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6, and the binding of the MCM protein complex to the ARS to form the prereplicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of the pre-RC by the CDC7 and CDC28 protein kinases.

    MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.

    The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.

    In addition to its role as a replication factor, the MCM6 protein has DNA helicase activity when complexed as a hexamer (containing two molecules each of MCM4, MCM6 and MCM7), suggesting that this complex is involved in the initiation of DNA replication as a DNA-unwinding enzyme. Xenopus MCM6 exists in two forms, maternal and zygotic, suggesting that specific forms of MCM6 may be used at different developmental stages.

    Proteins where this domain is known:
    PF13_0291   


    PR01738 - RNABINDINGM8 (Prints link)

    Interpro entry IPR008111 : RNA binding motif protein 8 (Interpro link)

    Interpro description:

    RNA-binding motif protein 8 (RBM8) contains a putative RNA-binding domain known as an RNA recognition motif (RRM). The RRM is found in numerous RNA-binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), and proteins implicated in regulation of alternative splicing. The RRM is a 90-residue domain that binds single-stranded RNA; the structure consists of four beta-strands and two alpha-helices arranged in an alpha/beta sandwich, with a third helix present in some cases during RNA binding. Three-dimensional modelling of the RBM8 RRM domain indicates that the sequences fold into an RNA-binding domain, forming a hydrophobic core between a beta-sheet and two helices.

    The human RBM8A protein is ubiquitously expressed; the protein is localised predominantly in the cell nucleus and diffused throughout the cytoplasm. It preferentially associates with mRNAs produced by splicing, including both nuclear mRNAs and newly exported cytoplasmic mRNAs. Evidence suggests the protein remains associated with spliced mRNAs as a tag to indicate the position of spliced introns. Human RBM8A protein specifically binds to MAGOH, the human homologue of Drosophila mago nashi, a protein required for normal germ plasm development in the Drosophila embryo; a similar association occurs with the Drosophila RBM8 protein, Tsunagi.

    The RBM8A and RBM8B protein sequences contain a putative bipartite nuclear localisation signal at the N-terminus, as well as a stretch of glycine residues. In addition, the RRM contained within RBM8A and RBM8B contains one set of the two consensus nucleic acid-binding motifs, RNP-1 and RNP-2, characteristic of heterogeneous nuclear ribonucleoproteins (hnRNPs).

    Proteins where this domain is known:
    PF14_0057   


    PR01798 - SCOASYNTHASE (Prints link)

    Interpro entry IPR005810 : Succinyl-CoA ligase, alpha subunit (Interpro link)

    Interpro description:

    Four different enzymes share a similar catalytic mechanism, which involves the phosphorylation by ATP (or GTP) of a specific histidine residue in the active site. These enzymes are:

  • ATP citrate-lyase, the primary enzyme responsible for the synthesis of cytosolic acetyl-CoA in many tissues, which catalyzes the formation of acetyl-CoA and oxaloacetate from citrate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. ATP citrate-lyase is a tetramer of identical subunits.
  • Succinyl-CoA ligase (GDP-forming), a mitochondrial enzyme that catalyzes the substrate-level phosphorylation step of the tricarboxylic acid cycle: the formation of succinyl-CoA from succinate with a concomitant hydrolysis of GTP to GDP and phosphate. This enzyme is a dimer composed of an alpha and a beta subunit.
  • Succinyl-CoA ligase (ADP-forming), a bacterial enzyme that during aerobic metabolism functions in the citric acid cycle, coupling the hydrolysis of succinyl-CoA to the synthesis of ATP. It can also function in the other direction for anabolic purposes. This enzyme is a tetramer composed of two alpha and two beta subunits.
  • Malate-CoA ligase (malyl-CoA synthetase), a bacterial enzyme that forms malyl-CoA from malate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. Malate-CoA ligase is composed of two different subunits.

    This entry corresponds to two regions, a glycine-rich conserved region, located in the second half of ATP citrate lyase and in the alpha subunits of succinyl-CoA ligases and malate-CoA ligase; and the active site phosphorylated histidine residue, which is located some 50 residues to the C-terminal of the first region.

    Proteins where this domain is known:
    PF11_0097   


    PR01799 - SFASSEMBLIN (Prints link)

    Interpro entry IPR008374 : SF-assemblin (Interpro link)

    Interpro description:

    Striated fibre assemblin (SFA), an acidic 33kDa protein, is the major component of striated microtubule-associated fibres (SMAFs) in the flagellar basal apparatus of green flagellates. In Chlamydomonas, and other green flagellates, the SMAFs form a cross-like pattern and run alongside the proximal parts of four bundles of flagellar root microtubules.

    The sequence of SFA contains two structurally distinct domains. The head domain, with ~30 residues, contains all the prolines (3-8 depending on species) and is rich in hydroxyamino acids. This non-helical domain is further characterised by the presence of repetitive SP-motifs, some of them in the context SP(M/T)R, which is a putative substrate for p34-CDC2 kinase. The rod domain, with ~250 residues, is predicted to be mostly alpha-helical (the alpha-helix content was estimated to be 76% for the entire molecule or 85% for the postulated rod domain). This domain shows a pronounced coiled-coil-forming ability and contains a 29-residue repeat pattern based on four heptads, followed by a skip residue.
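
    As a rough illustration of the two sequence features described above, the following Python sketch scans a sequence for SP(M/T)R motifs and labels heptad positions across a 29-residue repeat (four heptads plus one skip residue). The function names and the toy sequence are hypothetical; they are not taken from any SFA entry.

```python
import re

# SP(M/T)R context from the description: S, P, then M or T, then R.
SP_MOTIF = re.compile(r"SP[MT]R")

def find_sp_motifs(seq):
    """Return 0-based start positions of SP(M/T)R motifs in seq."""
    return [m.start() for m in SP_MOTIF.finditer(seq)]

def repeat_register(length, repeat=29, heptad=7):
    """Label residues with heptad positions a-g; '-' marks the skip residue."""
    in_heptads = (repeat // heptad) * heptad  # 28 residues lie in full heptads
    labels = []
    for i in range(length):
        pos = i % repeat
        labels.append("abcdefg"[pos % heptad] if pos < in_heptads else "-")
    return "".join(labels)

# Toy sequence (an assumption for illustration only).
toy_head = "MSPTRASPMRQSPAK"
print(find_sp_motifs(toy_head))
print(repeat_register(30))
```

    The register function only counts positions; it does not predict coiled-coil propensity from the residues themselves.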

    Proteins where this domain is known:
    MAL8P1.146    PF14_0311   


    PR01839 - RAD23PROTEIN (Prints link)

    Interpro entry IPR014761 : (Interpro link)

    Interpro description:

    All proteins in this family for which functions are known are components of a multiprotein complex used for targeting nucleotide excision repair to specific parts of the genome. Rad23 contains a ubiquitin-like domain that interacts with catalytically active proteasomes and two ubiquitin (Ub)-associated (UBA) sequences that bind Ub. Rad23 interacts with ubiquitinated cellular proteins through the synergistic action of its UBA domains.

    In humans, Rad23 complexes with the XPC protein.

    Proteins where this domain is known:
    PF10_0114   


    PR01848 - U2AUXFACTOR (Prints link)

    Interpro entry IPR009145 : U2 auxiliary factor small subunit (Interpro link)

    Interpro description:

    The U2 small nuclear ribonucleoprotein auxiliary factor (U2AF) is a heterodimeric splicing factor composed of a large and a small subunit. The large U2AF subunit recognises the intronic polypyrimidine tract, a sequence located adjacent to the 3' splice site that serves as an important signal for both constitutive and regulated pre-mRNA splicing. The small subunit interacts with the 3' splice site dinucleotide AG and is essential for regulated splicing. The subunits shuttle continuously between the nucleus and the cytoplasm via a mechanism that involves carrier receptors and is independent of binding to mRNA. Both subunits contain an arginine/serine-rich (RS) domain, which acts as a nuclear localisation signal. Furthermore, the presence of an RS domain on either subunit is sufficient to trigger the nucleocytoplasmic import of the heterodimeric complex.

    The human form of the U2 auxiliary factor small subunit, hU2AF35, contains a degenerate RNA recognition motif (RRM) and a C-terminal RS domain. The murine form has been shown to be genomically imprinted with monoallelic expression from the paternal allele. However, this is not the case in humans.

    Proteins where this domain is known:
    PF11_0200   


    PR01849 - UBIQUITINACT (Prints link)

    Interpro entry IPR000011 : Ubiquitin-activating enzyme, E1-like (Interpro link)

    Interpro description:

    The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme is responsible for activating ubiquitin, the first step in ubiquitinylation. The E1 enzyme hydrolyses ATP and adenylates the C-terminal glycine residue of ubiquitin, and then links this residue to the active site cysteine of E1, yielding a ubiquitin-thioester and free AMP. To be fully active, E1 must non-covalently bind to and adenylate a second ubiquitin molecule. The E1 enzyme can then transfer the thioester-linked ubiquitin molecule to a cysteine residue on the ubiquitin-conjugating enzyme, E2, in an ATP-dependent reaction.

    Proteins where this domain is known:
    PF13_0182    PFL1245w   


    PR01868 - ABCEFAMILY (Prints link)

    Interpro entry IPR013283 : (Interpro link)

    Interpro description:

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per systems) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.

    The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
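
    The conserved motifs named above can be located in a sequence with simple pattern matching. The Python sketch below is a minimal illustration: the LSGGQ signature comes from the description, whereas the Walker A pattern G-x(4)-G-K-[ST] is the commonly cited P-loop consensus and is an assumption added here, as are the function name and the toy sequence. Real ABC proteins may deviate from both patterns.

```python
import re

MOTIFS = {
    # Commonly cited P-loop consensus (assumption, not stated in the entry).
    "Walker A": re.compile(r"G.{4}GK[ST]"),
    # Signature motif quoted in the description.
    "signature": re.compile(r"LSGGQ"),
}

def scan_abc_motifs(seq):
    """Return {motif name: [0-based start positions]} for each pattern."""
    return {name: [m.start() for m in pat.finditer(seq)]
            for name, pat in MOTIFS.items()}

# Toy sequence made up for illustration; not a real ABC protein.
toy = "MKLGPSGSGKSTLLAAQRLSGGQKQRVA"
print(scan_abc_motifs(toy))
```

    Exact-match patterns are deliberately strict; profile-based methods such as those behind the Prints and Prosite entries listed here tolerate more variation.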

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification; (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    This entry represents the ABCE family of ATP-binding cassette (ABC) transporters and comprises solely the ABCE1 gene product, a 68kDa polypeptide found in insect cells and multicellular eukaryotes, but not in yeast. ABCE1 contains 2 nucleotide-binding domains (NBDs) typical of the ABC transporter protein superfamily; however, it lacks the transmembrane domains required for membrane transport functions. ABCE1 is an endoribonuclease inhibitor that interacts directly with RNase L to prevent it from binding 2-5A (5'-phosphorylated 2',5'-linked oligoadenylates). RNase L plays a major role in the anti-viral and anti-proliferative activities of interferons, and its inhibition by ABCE1 occurs in a concentration-dependent manner. Recently, ABCE1 has been shown to be essential for the assembly of immature HIV-1 capsids in insect cells and higher eukaryotic cell types. ABCE1 expression is induced during HIV-1 infection, and the protein is understood to bind HIV-1 Gag (p55) polypeptides following their translation and to promote their assembly into immature HIV-1 capsids.

    Proteins where this domain is known:
    MAL13P1.344   


    PS01031 - HSP20 (Prosite link)

    Interpro entry IPR002068 : (Interpro link)

    Interpro description:

    Prokaryotic and eukaryotic organisms respond to heat shock or other environmental stress by inducing the synthesis of proteins collectively known as heat-shock proteins (hsp). Amongst them is a family of proteins with an average molecular weight of 20 kDa, known as the hsp20 proteins. These seem to act as chaperones that can protect other proteins against heat-induced denaturation and aggregation. Hsp20 proteins seem to form large heterooligomeric aggregates. Structurally, this family is characterised by the presence of a conserved C-terminal domain of about 100 residues.

    Proteins where this domain is known:
    MAL8P1.78    PF13_0021    PFL0550w   


    PS50003 - PH_DOMAIN (Prosite link)

    Interpro entry IPR001849 : (Interpro link)

    Interpro description:

    The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton.

    The function of this domain is not clear; several putative functions have been suggested:

  • binding to the beta/gamma subunit of heterotrimeric G proteins,
  • binding to lipids, e.g. phosphatidylinositol-4,5-bisphosphate,
  • binding to phosphorylated Ser/Thr residues,
  • attachment to membranes by an unknown mechanism.

    It is possible that different PH domains have totally different ligand requirements.

    The 3D structure of several PH domains has been determined. All known cases have a common structure consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic helix. The loops connecting the beta-strands differ greatly in length, making the PH domain relatively difficult to detect. There are no totally invariant residues within the PH domain.

    Proteins reported to contain one or more PH domains belong to the following families:

    Proteins where this domain is known:
    MAL13P1.306    PF10_0189    PF11_0242    PF11_0327    PFB0257c   


    PS50004 - C2 (Prosite link)

    Interpro entry IPR018029 : (Interpro link)

    Interpro description:

    The C2 domain is a Ca2+-dependent membrane-targeting module found in many cellular proteins involved in signal transduction or membrane trafficking. C2 domains are unique among membrane-targeting domains in that they show a wide range of lipid selectivity for the major components of cell membranes, including phosphatidylserine and phosphatidylcholine. This C2 domain is about 116 amino-acid residues long and is located between the two copies of the C1 domain in Protein Kinase C (which bind phorbol esters and diacylglycerol) and the protein kinase catalytic domain. Regions with significant homology to the C2 domain have been found in many proteins. The C2 domain is thought to be involved in calcium-dependent phospholipid binding and in membrane-targeting processes such as subcellular localisation.

    The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded beta sandwich constructed around a conserved 4-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar ternary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel beta sandwich.

    Proteins where this domain is known:
    MAL8P1.134    PF14_0530   


    PS50005 - TPR (Prosite link)

    Interpro entry IPR013026 : (Interpro link)

    Interpro description:

    The tetratrico peptide repeat region (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.

    The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24° within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed mainly by residues on helix A, and the other surface presents residues from both helices A and B.

    Proteins where this domain is known:
    MAL13P1.139    MAL13P1.274    MAL13P1.52    PF07_0026    PF11_0124    PF13_0190    PF13_0231    PF14_0031    PF14_0098    PF14_0324    PFF0080c    PFF1505w    PFI1060w    PFL0615w    PFL2015w    PFL2120w    PFL2275c   


    PS50006 - FHA_DOMAIN (Prosite link)

    Interpro entry IPR000253 : (Interpro link)

    Interpro description:

    The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.

    To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.

    Proteins where this domain is known:
    MAL13P1.405    PF11_0347    PF13_0042    PFI0470w    PFL0275w   


    PS50007 - PIPLC_X_DOMAIN (Prosite link)

    Interpro entry IPR000909 : Phospholipase C, phosphatidylinositol-specific, X region (Interpro link)

    Interpro description:

    Phosphatidylinositol-specific phospholipase C, a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins.

    In mammals, there are at least 6 different isoforms of PI-PLC; they differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.

    All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as the 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. By profile analysis, we could show that sequences with significant similarity to the X-box domain occur also in prokaryotic and trypanosome PI-specific phospholipases C. Apart from this region, the prokaryotic enzymes show no similarity to their eukaryotic counterparts.

    Proteins where this domain is known:
    PF10_0132    PF14_0060   


    PS50008 - PIPLC_Y_DOMAIN (Prosite link)

    Interpro entry IPR001711 : Phospholipase C, phosphatidylinositol-specific, Y domain (Interpro link)

    Interpro description:

    Phosphatidylinositol-specific phospholipase C, a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins.

    In mammals, there are at least 6 different isoforms of PI-PLC; they differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.

    All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as the 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. At the C-terminal end of the Y-box, there is a C2 domain possibly involved in Ca-dependent membrane attachment.
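
    The fixed NH2-X-Y-COOH order and variable spacing described above can be checked against domain coordinates. The Python sketch below assumes domain hits are available as (name, start, end) records with 1-based inclusive positions; the record type, function name and example coordinates are hypothetical and do not describe any protein listed here.

```python
from typing import NamedTuple

class DomainHit(NamedTuple):
    name: str   # "X" or "Y" (hypothetical labels for the two boxes)
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

def xy_spacing(hits):
    """Return the number of residues between the X-box and the Y-box.

    Raises ValueError if the Y-box does not lie entirely after the X-box.
    """
    by_name = {h.name: h for h in hits}
    x, y = by_name["X"], by_name["Y"]
    if x.end >= y.start:
        raise ValueError("expected NH2-X-Y-COOH order")
    return y.start - x.end - 1

# Made-up coordinates for illustration only.
hits = [DomainHit("X", 300, 445), DomainHit("Y", 520, 640)]
print(xy_spacing(hits))
```

    A spacing well above the 50-100 residue range would be consistent with inserted domains, as in the gamma isoforms, although this sketch does not identify them.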

    Proteins where this domain is known:
    PF10_0132   


    PS50011 - PROTEIN_KINASE_DOM (Prosite link)

    Interpro entry IPR000719 : Protein kinase, core (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.

    Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common with both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme. This entry includes protein kinases from eukaryotes and viruses and may include some bacterial hits too.

    Proteins where this domain is known:
    MAL13P1.114    MAL13P1.185    MAL13P1.196    MAL13P1.278    MAL13P1.279    MAL13P1.84    MAL7P1.100    MAL7P1.127    MAL7P1.132    MAL7P1.144    MAL7P1.175    MAL7P1.18    MAL7P1.26    MAL7P1.73    MAL7P1.91    MAL8P1.203    MAL8P1.42    PF07_0072    PF08_0044    PF10_0141    PF10_0160    PF10_0380    PF11_0060    PF11_0079    PF11_0096    PF11_0127    PF11_0147    PF11_0156    PF11_0220    PF11_0227    PF11_0239    PF11_0242    PF11_0377    PF11_0464    PF11_0488    PF11_0510    PF13_0085    PF13_0166    PF13_0211    PF13_0258    PF14_0227    PF14_0264    PF14_0294    PF14_0320    PF14_0346    PF14_0392    PF14_0408    PF14_0423    PF14_0431    PF14_0476    PF14_0516    PF14_0715    PF14_0734    PFA0130c    PFA0380w    PFB0150c    PFB0520w    PFB0605w    PFB0665w    PFB0815w    PFC0060c    PFC0105w    PFC0385c    PFC0420w    PFC0485w    PFC0525c    PFC0755c    PFC0945w    PFD0740w    PFD0865c    PFD1165w    PFD1175w    PFE0045c    PFE1290w    PFF0260w    PFF0520w    PFF0750w    PFF1145c    PFF1370w    PFI0095c    PFI0100c    PFI0105c    PFI0110c    PFI0115c    PFI0120c    PFI0125c    PFI1275w    PFI1280c    PFI1290w    PFI1415w    PFI1685w    PFL0040c    PFL0080c    PFL1370w    PFL1885c    PFL2250c    PFL2280w   


    PS50012 - RCC1_3 (Prosite link)

    Interpro entry IPR000408 : (Interpro link)

    Interpro description:

    The regulator of chromosome condensation (RCC1) is a eukaryotic protein which binds to chromatin and interacts with Ran, a nuclear GTP-binding protein, to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with Ran probably plays an important role in the regulation of gene expression.

    RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. The repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.

    The RCC1-type of repeat is also found in the X-linked retinitis pigmentosa GTPase regulator. The RCC repeats form a beta-propeller structure.

    Proteins where this domain is known:
    MAL7P1.38    PF11_0385    PF13_0303    PFD0900w    PFE0420c    PFI0975c    PFI1500w    PFL0975w   


    PS50013 - CHROMO_2 (Prosite link)

    Interpro entry IPR000953 : Chromo domain (Interpro link)

    Interpro description:
    The CHROMO (CHRomatin Organization MOdifier) domain is a conserved region of around 60 amino acids, originally identified in Drosophila modifiers of variegation. These are proteins that alter the structure of chromatin to the condensed morphology of heterochromatin, a cytologically visible condition where gene expression is repressed. In one of these proteins, Polycomb, the chromo domain has been shown to be important for chromatin targeting. Proteins that contain a chromo domain appear to fall into 3 classes. The first class includes proteins having an N-terminal chromo domain followed by a region termed the chromo shadow domain, e.g. Drosophila and human heterochromatin protein Su(var)205 (HP1). The second class includes proteins with a single chromo domain, e.g. Drosophila protein Polycomb (Pc); mammalian modifier 3; human Mi-2 autoantigen; and several yeast and Caenorhabditis elegans hypothetical proteins. In the third class paired tandem chromo domains are found, e.g. in mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1.

    Proteins where this domain is known:
    PF11_0418    PFL1005c   


    PS50014 - BROMODOMAIN_2 (Prosite link)

    Interpro entry IPR001487 : (Interpro link)

    Interpro description:
    Bromodomains are found in a variety of mammalian, invertebrate and yeast DNA-binding proteins. Bromodomains can interact with acetylated lysine. In some proteins, the classical bromodomain has diverged to such an extent that parts of the region are either missing or contain an insertion (e.g., mammalian protein HRX, Caenorhabditis elegans hypothetical protein ZK783.4, yeast protein YTA7). The bromodomain may occur as a single copy, or in duplicate.

    The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.

    Proteins where this domain is known:
    PF08_0034    PF10_0328    PF14_0724    PFA0510w    PFF1440w    PFL0635c    PFL1645w   


    PS50016 - ZF_PHD_2 (Prosite link)

    Interpro entry IPR001965 : Zinc finger, PHD-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the PHD (plant homeodomain) zinc finger domain, which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from, the C3HC4 type RING finger.

    The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF10_0079    PF11_0429    PFF1185w    PFF1440w    PFL0575w    PFL1010c   


    PS50020 - WW_DOMAIN_2 (Prosite link)

    Interpro entry IPR001202 : WW/Rsp5/WWP (Interpro link)

    Interpro description:

    Synonym(s): Rsp5 or WWP domain

    The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes.
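
    As a rough illustration of the proline motif quoted above, the consensus [AP]-P-P-[AP]-Y can be written as a regular expression and scanned against a peptide. This is a minimal sketch: the function name and the example peptide are hypothetical, and a sequence match alone does not show that a WW domain actually binds there.

```python
import re

# Consensus from the text: [AP]-P-P-[AP]-Y.
# The lookahead lets overlapping occurrences be reported as well.
WW_LIGAND = re.compile(r"(?=([AP]PP[AP]Y))")

def find_ww_ligand_motifs(sequence):
    """Return (0-based start, matched text) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in WW_LIGAND.finditer(sequence.upper())]

# Hypothetical peptide, for illustration only
example = "GSPPPPYTVAPPAYQ"
hits = find_ww_ligand_motifs(example)
print(hits)
```

    The positions returned are 0-based offsets into the scanned string; add 1 to obtain conventional residue numbering.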

    A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein and others.

    Proteins where this domain is known:
    MAL8P1.40    PF11_0118    PF13_0091    PF13_0315    PF14_0096    PFL1745c   


    PS50021 - CH (Prosite link)

    Interpro entry IPR001715 : (Interpro link)

    Interpro description:

    The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:

    A comprehensive review of proteins containing this type of actin-binding domains is given in.

    The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin binding. Most proteins have two copies of the CH domain, but some proteins, such as calponin and the human vav proto-oncogene, have only a single copy. The structure of an example CH-domain has recently been solved.

    Proteins where this domain is known:
    MAL8P1.136    PF14_0454    PFC0305w    PFI1450c   


    PS50022 - FA58C_3 (Prosite link)

    Interpro entry IPR000421 : Coagulation factor 5/8 type, C-terminal (Interpro link)

    Interpro description:
    Blood coagulation factors V and VIII contain a C-terminal, twice repeated, domain of about 150 amino acids, which is called F5/8 type C, FA58C, or C1/C2-like domain. In the Dictyostelium discoideum (Slime mold) cell adhesion protein discoidin, a related domain, named discoidin I-like domain, DLD, or DS, has been found which shares a common C-terminal region of about 110 amino acids with the FA58C domain, but whose N-terminal 40 amino acids are much less conserved. Similar domains have been detected in other extracellular and membrane proteins. In coagulation factors V and VIII the repeated domains compose part of a larger functional domain which promotes binding to anionic phospholipids on the surface of platelets and endothelial cells. The C-terminal domain of the second FA58C repeat (C2) of coagulation factor VIII has been shown to be responsible for phosphatidylserine-binding and essential for activity. It forms an amphipathic alpha-helix, which binds to the membrane. FA58C contains two conserved cysteines in most proteins, which link the extremities of the domain by a disulphide bond. A further disulphide bond is located near the C-terminal of the second FA58C domain in MFGM.

    Proteins where this domain is known:
    PF14_0532    PF14_0723   


    PS50026 - EGF_3 (Prosite link)

    Interpro entry IPR000742 : (Interpro link)

    Interpro description:
    A sequence about thirty to forty amino-acid residues long, found in epidermal growth factor (EGF), has been shown to be present, in a more or less conserved form, in a large number of other, mostly animal proteins. The functional significance of EGF domains in what appear to be unrelated proteins is not yet clear. However, a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six cysteine residues which have been shown (in EGF) to be involved in disulphide bonds. The main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. Subdomains between the conserved cysteines vary in length.

    Proteins where this domain is known:
    PFF1120c   


    PS50030 - UBA (Prosite link)

    Interpro entry IPR015940 : (Interpro link)

    Interpro description:

    UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded beta-sheet of ubiquitin.

    This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.

    More information about EF1B (EF-Ts) proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PF10_0114    PF11_0142    PF11_0329    PF13_0301    PFD0655c    PFI1525w   


    PS50031 - EH (Prosite link)

    Interpro entry IPR000261 : (Interpro link)

    Interpro description:

    The EH (for Eps15 Homology) domain is a protein-protein interaction module of approximately 95 residues which was originally identified as a repeated sequence present in three copies at the N-terminus of the tyrosine kinase substrates Eps15 and Eps15R. The EH domain was subsequently found in several proteins implicated in endocytosis, vesicle transport and signal transduction in organisms ranging from yeast to mammals. EH domains are present in one to three copies and they may include calcium-binding domains of the EF-hand type. Eps15 is divided into three domains: domain I contains signatures of a regulatory domain, including a candidate tyrosine phosphorylation site and EF-hand-type calcium-binding domains, domain II presents the characteristic heptad repeats of coiled-coil rod-like proteins, and domain III displays a repeated aspartic acid-proline-phenylalanine motif similar to a consensus sequence of several methylases.

    EH domains have been shown to bind specifically but with moderate affinity to peptides containing short, unmodified motifs through predominantly hydrophobic interactions. The target motifs are divided into three classes: class I consists of the consensus Asn-Pro-Phe (NPF) sequence; class II consists of aromatic and hydrophobic di- and tripeptide motifs, including the Phe-Trp (FW), Trp-Trp (WW), and Ser-Trp-Gly (SWG) motifs; and class III contains the His-(Thr/Ser)-Phe motif (HTF/HSF). The structure of several EH domains has been solved by NMR spectroscopy. The fold consists of two helix-loop-helix motifs characteristic of EF-hand domains, connected by a short antiparallel beta-sheet. The target peptide is bound in a hydrophobic pocket between two alpha helices. Sequence analysis and structural data indicate that not all the EF-hands are capable of binding calcium because of substitutions of the calcium-liganding residues in the loop.
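
    The three target-motif classes listed above can be expressed as simple patterns. The sketch below is illustrative only: the dictionary, function name and example peptide are hypothetical, and such short motifs occur by chance in many sequences, so a hit is at most a hint.

```python
import re

# Target-motif classes as described in the text (one-letter residue codes)
EH_TARGET_CLASSES = {
    "class I": re.compile(r"NPF"),          # Asn-Pro-Phe
    "class II": re.compile(r"FW|WW|SWG"),   # Phe-Trp, Trp-Trp, Ser-Trp-Gly
    "class III": re.compile(r"H[TS]F"),     # His-(Thr/Ser)-Phe
}

def classify_eh_motifs(peptide):
    """Map each class with at least one hit to its (0-based start, text) matches."""
    peptide = peptide.upper()
    found = {}
    for label, pattern in EH_TARGET_CLASSES.items():
        matches = [(m.start(), m.group()) for m in pattern.finditer(peptide)]
        if matches:
            found[label] = matches
    return found

# Hypothetical peptide, for illustration only
result = classify_eh_motifs("AANPFDDHSFKSWG")
print(result)
```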

    This domain is often implicated in the regulation of protein transport/sorting and membrane trafficking. Messenger RNA translation initiation and cytoplasmic poly(A) tail shortening require the poly(A)-binding protein (PAB) in yeast. The PAB-dependent poly(A) ribonuclease (PAN) is organised into distinct domains containing repeated sequence elements.

    Proteins where this domain is known:
    PF10_0244    PFC0190c   


    PS50033 - UBX (Prosite link)

    Interpro entry IPR001012 : (Interpro link)

    Interpro description:
    The UBX domain is found in ubiquitin-regulatory proteins, which are members of the ubiquitination pathway, as well as a number of other proteins including FAF-1 (FAS-associated factor 1), the human Rep-8 reproduction protein and several hypothetical proteins from yeast. The function of the UBX domain is not known although the fragment of avian FAF-1 containing the UBX domain causes apoptosis of transfected cells.

    Proteins where this domain is known:
    MAL8P1.122    PFI1680w   


    PS50035 - PLD (Prosite link)

    Interpro entry IPR001736 : Phospholipase D/Transphosphatidylase (Interpro link)

    Interpro description:

    Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs.

    Proteins where this domain is known:
    MAL8P1.58    PFF0465c    PFI0755c   


    PS50040 - EF1G_C (Prosite link)

    Interpro entry IPR001662 : Translation elongation factor EF1B, gamma chain, conserved (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    This entry represents a conserved domain usually found near the C-terminus of EF1B-gamma chains, a peptide of 410-440 residues. The gamma chain appears to play a role in anchoring the EF1B complex to the beta and delta chains and to other cellular components.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PF13_0214   


    PS50042 - CNMP_BINDING_3 (Prosite link)

    Interpro entry IPR000595 : (Interpro link)

    Interpro description:
    Proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues. The best studied of these proteins is the prokaryotic catabolite gene activator (also known as the cAMP receptor protein) (gene crp) where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure. There are six invariant amino acids in this domain, three of which are glycine residues that are thought to be essential for maintenance of the structural integrity of the beta-barrel. cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain. The cAPKs are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain. The cGPKs are single chain enzymes that include the two copies of the domain in their N-terminal section. Vertebrate cyclic nucleotide-gated ion-channels also contain this domain. Two such cation channels have been fully characterised; one is found in rod cells where it plays a role in visual signal transduction.

    Proteins where this domain is known:
    PF14_0172    PF14_0173    PF14_0346    PFL1110c   


    PS50051 - MCM_2 (Prosite link)

    Interpro entry IPR001208 : DNA-dependent ATPase MCM (Interpro link)

    Interpro description:

    MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.

    This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.

    The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.
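
    A minimal sketch of locating the diagnostic sequence named above in a protein sequence follows. It does exact string matching only, on the assumption that the heptapeptide is present unchanged; the function name and example sequence are hypothetical.

```python
# Diagnostic heptapeptide quoted in the text
MCM_DIAGNOSTIC = "IDEFDKM"

def mcm_motif_positions(sequence):
    """Return every 0-based start position of the exact diagnostic sequence."""
    seq = sequence.upper()
    positions = []
    start = seq.find(MCM_DIAGNOSTIC)
    while start != -1:
        positions.append(start)
        start = seq.find(MCM_DIAGNOSTIC, start + 1)
    return positions

# Hypothetical fragment, for illustration only
positions = mcm_motif_positions("MKKIDEFDKMLL")
print(positions)
```

    Exact matching will miss family members whose motif has diverged, so an empty result does not rule out an MCM protein.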

    The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.

    Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.

    Proteins where this domain is known:
    PF07_0023    PF13_0095    PF13_0291    PF14_0177    PFD0790c    PFE1345c    PFL0560c    PFL0580w   


    PS50052 - GUANYLATE_KINASE_2 (Prosite link)

    Interpro entry IPR008144 : (Interpro link)

    Interpro description:

    Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and in vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus.

    Proteins containing one or more copies of the DHR domain, an SH3 domain as well as a C-terminal GK-like domain, are collectively termed MAGUKs (membrane-associated guanylate kinase homologs), and include Drosophila lethal(1)discs large-1 tumor suppressor protein (gene dlg1); mammalian tight junction protein Zo-1; a family of mammalian synaptic proteins that seem to interact with the cytoplasmic tail of NMDA receptor subunits (SAP90/PSD-95, CHAPSYN-110/PSD-93, SAP97/DLG1 and SAP102); vertebrate 55 kD erythrocyte membrane protein (p55); Caenorhabditis elegans protein lin-2; rat protein CASK; and human proteins DLG2 and DLG3. There is an ATP-binding site (P-loop) in the N-terminal section of GK, which is not conserved in the GK-like domain of the above proteins. However these proteins retain the residues known, in GK, to be involved in the binding of GMP.

    Proteins where this domain is known:
    PFI1420w   


    PS50053 - UBIQUITIN_2 (Prosite link)

    Interpro entry IPR000626 : Ubiquitin (Interpro link)

    Interpro description:

    Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.

    Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells and whose sequence is extremely well conserved from protozoan to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.

    Proteins where this domain is known:
    MAL13P1.64    MAL8P1.62    PF08_0067    PF10_0114    PF11_0142    PF11_0329    PF13_0084    PF13_0346    PF14_0027    PFE0285c    PFE0380c    PFE1355c    PFI1085w    PFL0585w    PFL1830w   
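
    Some identifiers in the list above also appear under PS50030 (UBA) earlier in this listing. A small sketch of how such overlaps can be found from the whitespace-separated lists is given below; the helper name is hypothetical, and the two strings are copied from the PS50030 and PS50053 entries of this listing.

```python
def parse_ids(line):
    """Split a whitespace-separated identifier list into a set."""
    return set(line.split())

# Identifier lists copied from the PS50030 (UBA) and PS50053 (ubiquitin) entries
uba = parse_ids(
    "PF10_0114    PF11_0142    PF11_0329    PF13_0301    PFD0655c    PFI1525w"
)
ubiquitin = parse_ids(
    "MAL13P1.64    MAL8P1.62    PF08_0067    PF10_0114    PF11_0142    "
    "PF11_0329    PF13_0084    PF13_0346    PF14_0027    PFE0285c    "
    "PFE0380c    PFE1355c    PFI1085w    PFL0585w    PFL1830w"
)

# Proteins annotated with both domains
shared = sorted(uba & ubiquitin)
print(shared)
```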


    PS50054 - TYR_PHOSPHATASE_DUAL (Prosite link)

    Interpro entry IPR000340 : Protein-tyrosine phosphatase, dual specificity (Interpro link)

    Interpro description:

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:

    Based on their cellular localisation, PTPases are also classified as:

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
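
    The signature motif C(X)5R, a cysteine followed by any five residues and then an arginine, translates directly into a regular expression. The sketch below is an assumption-laden illustration: the function name and the example fragment are hypothetical, and a pattern this short matches many unrelated sequences, so it is far less specific than a curated profile.

```python
import re

# C(X)5R: Cys, any five residues, Arg. The lookahead reports overlapping hits.
PTP_SIGNATURE = re.compile(r"(?=(C[A-Z]{5}R))")

def find_ptp_signature(sequence):
    """Return (0-based start, matched text) for each C(X)5R occurrence."""
    return [(m.start(), m.group(1)) for m in PTP_SIGNATURE.finditer(sequence.upper())]

# Hypothetical fragment, for illustration only
signature_hits = find_ptp_signature("AAHCSAGIGRSG")
print(signature_hits)
```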

    This entry represents dual specificity protein-tyrosine phosphatases. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr and tyrosine specific protein phosphatase activity, able to remove either serine/threonine- or tyrosine-bound phosphate groups from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. The crystal structure of a human DSP, vaccinia H1-related phosphatase (or VHR), has been determined at 2.1 angstrom resolution. A shallow active site pocket in VHR allows for the hydrolysis of phosphorylated serine, threonine, or tyrosine protein residues, whereas the deeper active site of protein tyrosine phosphatases (PTPs) restricts substrate specificity to only phosphotyrosine. Positively charged crevices near the active site may explain the enzyme's preference for substrates with two phosphorylated residues. The VHR structure defines a conserved structural scaffold for both DSPs and PTPs. A "recognition region" connecting helix alpha1 to strand beta1 may determine differences in substrate specificity between VHR, the PTPs, and other DSPs.

    These proteins may also have inactive phosphatase domains, and dependent on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates, and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class have accumulated several variable amino acid substitutions and have a complete loss of tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre.

    Proteins where this domain is known:
    PF14_0524    PFC0380w   


    PS50056 - TYR_PHOSPHATASE_2 (Prosite link)

    Interpro entry IPR000387 : Protein-tyrosine phosphatase (Interpro link)

    Interpro description:

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:

    Based on their cellular localisation, PTPases are also classified as:

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
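
    The C(X)5R signature described above (a cysteine, any five residues, then an arginine) can be searched for with a regular expression. The following is a minimal sketch only: the function name and the toy sequence are hypothetical, and a bare C(X)5R hit is far too permissive to identify a PTPase on its own.

```python
import re

# C(X)5R: cysteine, any five residues, arginine.
# The lookahead lets overlapping occurrences be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based start, matched text) for every C(X)5R occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


# Hypothetical toy sequence, not a real Plasmodium protein.
toy = "MKVHCSAGVGRSGT"
print(find_ptp_signature(toy))  # [(5, 'CSAGVGR')]
```

    A hit only indicates that the spacing of the two catalytic residues is present; the Prosite profile for this entry scores the whole domain rather than this short motif.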

    This entry includes proteins of two subfamilies: Ser/Thr and Tyr dual specificity protein phosphatase and tyrosine specific protein phosphatase. Both of these subfamilies may also have inactive phosphatase domains, and dependent on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates, and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class have accumulated several variable amino acid substitutions and have a complete loss of tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a regulatory centre.

    Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr and tyrosine specific protein phosphatase activity, able to remove either the serine/threonine-bound or the tyrosine-bound phosphate group from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. Tyrosine specific protein phosphatases catalyse the removal of a phosphate group attached to a tyrosine residue. They are also very important in the control of cell growth, proliferation, differentiation and transformation.

    Proteins where this domain is known:
    PF11_0139    PF11_0281    PFC0380w   


    PS50059 - FKBP_PPIASE (Prosite link)

    Interpro entry IPR001179 : Peptidyl-prolyl cis-trans isomerase, FKBP-type (Interpro link)

    Interpro description:

    Synonym(s): Peptidylprolyl cis-trans isomerase

    FKBP-type peptidylprolyl isomerases in vertebrates are receptors for the two immunosuppressants, FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms.

    Proteins where this domain is known:
    MAL13P1.68    PFL2275c   


    PS50064 - PARP_ZN_FINGER_2 (Prosite link)

    Interpro entry IPR001510 : Zinc finger, PARP-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents PARP (poly(ADP-ribose) polymerase) type zinc finger domains.

    NAD(+) ADP-ribosyltransferase is a eukaryotic enzyme that catalyses the covalent attachment of ADP-ribose units from NAD(+) to various nuclear acceptor proteins. This post-translational modification of nuclear proteins is dependent on DNA. It appears to be involved in the regulation of various important cellular processes such as differentiation, proliferation and tumour transformation as well as in the regulation of the molecular events involved in the recovery of the cell from DNA damage. Structurally, NAD(+) ADP-ribosyltransferase consists of three distinct domains: an N-terminal zinc-dependent DNA-binding domain, a central automodification domain and a C-terminal NAD-binding domain. The DNA-binding region contains a pair of PARP-type zinc finger domains which have been shown to bind DNA in a zinc-dependent manner. The PARP-type zinc finger domains seem to bind specifically to single-stranded DNA and to act as a DNA nick sensor. DNA ligase III contains, in its N-terminal section, a single copy of a zinc finger highly similar to those of PARP.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFL2440w   


    PS50067 - KINESIN_MOTOR_DOMAIN2 (Prosite link)

    Interpro entry IPR001752 : Kinesin, motor region (Interpro link)

    Interpro description:

    Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.

    The heavy chain is composed of three structural domains: a large globular N-terminal domain, which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP and to bind and move on microtubules); a central alpha-helical coiled coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.

    A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:

    The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.

    The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule-binding.

    Proteins where this domain is known:
    MAL8P1.132    PF07_0104    PF11_0478    PFA0535c    PFC0770c    PFC0860w    PFL0545w    PFL2165w    PFL2190c   


    PS50069 - CULLIN_2 (Prosite link)

    Interpro entry IPR016158 : Cullin homology (Interpro link)

    Interpro description:

    Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins.

    This entry represents the cullin homology region, which is composed of three domains: a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.

    Proteins where this domain is known:
    PF08_0094    PFF1445c   


    PS50070 - KRINGLE_2 (Prosite link)

    Interpro entry IPR000001 : (Interpro link)

    Interpro description:
    Kringles are autonomous structural domains, found throughout the blood clotting and fibrinolytic proteins. Kringle domains are believed to play a role in binding mediators (e.g., membranes, other proteins or phospholipids), and in the regulation of proteolytic activity. Kringle domains are characterised by a triple loop, 3-disulphide bridge structure, whose conformation is defined by a number of hydrogen bonds and small pieces of anti-parallel beta-sheet. They are found in a varying number of copies in some plasma proteins including prothrombin and urokinase-type plasminogen activator, which are serine proteases belonging to MEROPS peptidase family S1A.

    Proteins where this domain is known:
    PFI0550w   


    PS50072 - CSA_PPIASE_2 (Prosite link)

    Interpro entry IPR002130 : Peptidyl-prolyl cis-trans isomerase, cyclophilin-type (Interpro link)

    Interpro description:

    Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.

    Note: FKBPs, a family of proteins that bind the immunosuppressive drug FK506, are also PPIases, but their sequence is not at all related to that of cyclophilin.

    Proteins where this domain is known:
    PF08_0121    PF08_0128    PF11_0164    PF11_0170    PF14_0223    PFC0975c    PFE0505w    PFE1430c    PFI1490c    PFL0120c    PFL0735w   


    PS50075 - ACP_DOMAIN (Prosite link)

    Interpro entry IPR009081 : Acyl carrier protein-like (Interpro link)

    Interpro description:

    Acyl carrier protein (ACP) is an essential cofactor in the synthesis of fatty acids by the fatty acid synthetases systems in bacteria and plants. In addition to fatty acid synthesis, ACP is also involved in many other reactions that require acyl transfer steps, such as the synthesis of polyketide antibiotics, biotin precursor, membrane-derived oligosaccharides, and activation of toxins, and functions as an essential cofactor in lipoylation of pyruvate and alpha-ketoglutarate dehydrogenase complexes. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. Phosphopantetheine is attached to a serine residue in these proteins. The core structure of ACP consists of a four-helical bundle, where helix three is shorter than the others.

    Several other proteins share structural homology with ACP, such as the bacterial apo-D-alanyl carrier protein, which facilitates the incorporation of D-alanine into lipoteichoic acid by a ligase, necessary for the growth and development of Gram-positive organisms; and the thioester domain of the bacterial peptide carrier protein (PCP) found within large modular non-ribosomal peptide synthetases, which are responsible for the synthesis of a variety of microbial bioactive peptides.

    Proteins where this domain is known:
    PFB0385w    PFL0415w   


    PS50076 - DNAJ_2 (Prosite link)

    Interpro entry IPR001623 : Heat shock protein DnaJ, N-terminal (Interpro link)

    Interpro description:

    The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.

    Such a structure is shown in the following schematic representation:

    It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
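
    The CXXCXGXG repeat of the 'CRR' domain mentioned above lends itself to a simple pattern count. This is an illustrative sketch under stated assumptions: the function name and toy sequence are hypothetical, and real DnaJ proteins may carry repeats that deviate from the strict consensus.

```python
import re

# CXXCXGXG: C, two residues, C, one residue, G, one residue, G.
CRR_REPEAT = re.compile(r"C.{2}C.G.G")


def count_crr_repeats(sequence):
    """Count non-overlapping CXXCXGXG occurrences in a protein sequence."""
    return len(CRR_REPEAT.findall(sequence.upper()))


# Hypothetical toy sequence carrying four repeats separated by short linkers.
toy = "MS" + "KK".join(["CAACAGAG"] * 4)
print(count_crr_repeats(toy))  # 4
```

    A count of four would be consistent with the central domain described for DnaJ, but the count alone says nothing about whether a J domain is also present.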

    Proteins where this domain is known:
    MAL13P1.162    MAL13P1.277    MAL8P1.204    PF07_0103    PF08_0032    PF08_0115    PF10_0032    PF10_0058    PF10_0378    PF10_0381    PF11_0034    PF11_0099    PF11_0273    PF11_0380    PF11_0433    PF11_0443    PF11_0509    PF11_0512    PF11_0513    PF13_0036    PF13_0102    PF14_0013    PF14_0137    PF14_0213    PF14_0359    PF14_0700    PFA0110w    PFA0660w    PFA0675w    PFB0085c    PFB0090c    PFB0595w    PFB0920w    PFB0925w    PFD0462w    PFE0040c    PFE0055c    PFE0135w    PFE1170w    PFF1010c    PFF1415c    PFI0935w    PFI0985c    PFL0055c    PFL0565w    PFL0815w    PFL2550w   


    PS50077 - HEAT_REPEAT (Prosite link)

    Interpro entry IPR000357 : (Interpro link)

    Interpro description:

    The HEAT repeat is a tandemly repeated, 37-47 amino acid long module occurring in a number of cytoplasmic proteins, including the four name-giving proteins huntingtin, elongation factor 3 (EF3), the 65 kDa alpha regulatory subunit of protein phosphatase 2A (PP2A) and the yeast PI3-kinase TOR1. Arrays of HEAT repeats consist of 3 to 36 units forming a rod-like helical structure and appear to function as protein-protein interaction surfaces. It has been noted that many HEAT repeat-containing proteins are involved in intracellular transport processes.

    In the crystal structure of PP2A PR65/A, the HEAT repeats consist of pairs of antiparallel alpha helices, as had been predicted.

    Proteins where this domain is known:
    PFE0170c   


    PS50081 - ZF_DAG_PE_2 (Prosite link)

    Interpro entry IPR002219 : Protein kinase C, phorbol ester/diacylglycerol binding (Interpro link)

    Interpro description:

    Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC). Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown to bind PE and DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.

    Proteins where this domain is known:
    PFI1485c   


    PS50082 - WD_REPEATS_2 (Prosite link)

    Interpro entry IPR001680 : (Interpro link)

    Interpro description:

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    Proteins where this domain is known:
    MAL13P1.142    MAL13P1.148    MAL13P1.264    MAL13P1.385    MAL13P1.54    MAL7P1.81    MAL8P1.145    MAL8P1.43    PF07_0017    PF07_0092    PF08_0019    PF08_0065    PF08_0130    PF10_0128    PF10_0261    PF10_0326    PF11_0056    PF11_0171    PF11_0222    PF11_0471    PF13_0149    PF13_0250    PF13_0335    PF14_0087    PF14_0101    PF14_0243    PF14_0263    PF14_0314    PF14_0456    PFA0520c    PFC0100c    PFC0365w    PFD0455w    PFE0090w    PFE0505w    PFE0540w    PFE0930w    PFE1270c    PFF0330w    PFF0395c    PFF1000w    PFF1480w    PFI0290c    PFI1080w    PFL0470w    PFL0610w    PFL0970w    PFL1040w    PFL1290w    PFL1820w    PFL1975c    PFL2460w   


    PS50084 - KH_TYPE_1 (Prosite link)

    Interpro entry IPR004088 : K Homology, type 1 (Interpro link)

    Interpro description:

    The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently.

    According to structural analysis, the KH domain can be separated into two groups. The first group, or type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-sheets are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-1 KH domain include bacterial polyribonucleotide nucleotidyltransferases; vertebrate fragile X mental retardation protein 1 (FMR1); eukaryotic heterogeneous nuclear ribonucleoprotein K (hnRNP K), one of at least 20 major proteins that are part of hnRNP particles in mammalian cells; mammalian poly(rC) binding proteins; Artemia salina glycine-rich protein GRP33; yeast PAB1-binding protein 2 (PBP2); vertebrate vigilin; and human high-density lipoprotein binding protein (HDL-binding protein).
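
    The short conserved region VIGXXGXXI quoted above can be located with a regular expression, where X stands for any residue. The sketch below is an assumption-laden illustration: the function name and toy sequence are hypothetical, and matching the consensus literally will miss KH domains whose residues at the V/I positions are substituted.

```python
import re

# VIGXXGXXI: V-I-G, two residues, G, two residues, I.
KH_CORE = re.compile(r"VIG.{2}G.{2}I")


def find_kh_core(sequence):
    """Return the 1-based start positions of literal VIGXXGXXI matches."""
    return [m.start() + 1 for m in KH_CORE.finditer(sequence.upper())]


# Hypothetical toy sequence with a single copy of the motif.
toy = "AAVIGKKGSNIKK"
print(find_kh_core(toy))  # [3]
```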

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    PF10_0115    PF14_0151    PFF0250w    PFF1135w   


    PS50086 - TBC_RABGAP (Prosite link)

    Interpro entry IPR000195 : RabGAP/TBC (Interpro link)

    Interpro description:
    Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases.

    Proteins where this domain is known:
    MAL13P1.244    PF11_0151    PF13_0117    PF14_0699    PFE0330w    PFI0195c    PFI0345w    PFL1445w   


    PS50088 - ANK_REPEAT (Prosite link)

    Interpro entry IPR002110 : (Interpro link)

    Interpro description:

    The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.

    The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.

    Proteins where this domain is known:
    MAL13P1.126    MAL13P1.71    MAL8P1.28    PF10_0102    PF10_0213    PF10_0328    PF11_0197    PF14_0222    PFC0160w    PFE0400w    PFF1315w    PFF1365c   


    PS50089 - ZF_RING_2 (Prosite link)

    Interpro entry IPR001841 : Zinc finger, RING-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents RING-type zinc finger domains. The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as 'RING-H2 finger'. The RING domain is a protein interaction domain that has been implicated in a range of diverse biological processes. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain. E3 ubiquitin-protein ligases determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3. Various RING fingers also exhibit binding to E2 ubiquitin-conjugating enzymes (Ubc's).

    Several 3D-structures for RING-fingers are known. The 3D structure of the zinc ligation system is unique to the RING domain and is referred to as the 'cross-brace' motif. The spacing of the cysteines in such a domain is C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. Metal ligand pairs one and three co-ordinate to bind one zinc ion, whilst pairs two and four bind the second, as illustrated in the following schematic representation:

    Note that in the older literature, some RING-fingers are denoted as LIM-domains. The LIM-domain Zn-finger is a fundamentally different family, albeit with similar Cys-spacing.
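
    The cysteine/histidine spacing quoted above, C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C, translates directly into a regular expression. The following sketch assumes that translation; the function name and toy sequence are hypothetical. As written, the pattern has a histidine in the fourth ligand position only, so it covers the C3HC4 variant and not RING-H2.

```python
import re

# Each x(n to m) spacer becomes .{n,m}; ligand residues are literal C or H.
RING_C3HC4 = re.compile(
    r"C.{2}C"      # first ligand pair
    r".{9,39}"
    r"C.{1,3}H"    # second ligand pair
    r".{2,3}"
    r"C.{2}C"      # third ligand pair
    r".{4,48}"
    r"C.{2}C"      # fourth ligand pair
)


def find_ring_finger(sequence):
    """Return the (start, end) span of the first match (0-based, end exclusive), or None."""
    match = RING_C3HC4.search(sequence.upper())
    return match.span() if match else None


# Hypothetical toy sequence built to satisfy the spacing exactly once.
toy = "MK" + "CAAC" + "A" * 10 + "CAAH" + "AA" + "CAAC" + "A" * 10 + "CAAC"
print(find_ring_finger(toy))  # (2, 40)
```

    Because LIM domains have similar Cys-spacing, as noted above, a match to this expression does not by itself distinguish a RING finger from a LIM domain.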

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.216    MAL13P1.224    MAL13P1.76    MAL7P1.155    PF10_0046    PF10_0072    PF10_0117    PF10_0276    PF11_0244    PF13_0188    PF14_0054    PF14_0215    PF14_0416    PFB0440c    PFC0510w    PFC0610c    PFC0690c    PFC0740c    PFC0845c    PFD0765w    PFE0100w    PFE0610c    PFE1070c    PFE1490c    PFF0165c    PFF0355c    PFF0755c    PFF1180w    PFF1185w    PFF1325c    PFL0275w    PFL0440c    PFL1010c    PFL1705w   


    PS50090 - MYB_LIKE (Prosite link)

    Interpro entry IPR017877 : (Interpro link)

    Interpro description:
    The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family and specifically recognize the sequence YAAC(G/T)G. In myb, one of the most conserved regions, consisting of three tandem repeats, has been shown to be involved in DNA-binding.
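
    The recognition sequence YAAC(G/T)G can be scanned for in a DNA string once the ambiguity code is expanded. The sketch below assumes the standard IUPAC meaning of Y (C or T) and searches the given strand only; the function name and toy sequence are hypothetical.

```python
import re

# Y = C or T (IUPAC pyrimidine); (G/T) = G or T.
# The lookahead reports overlapping sites as well.
MYB_SITE = re.compile(r"(?=([CT]AAC[GT]G))")


def find_myb_sites(dna):
    """Return (1-based start, site) for each YAAC(G/T)G occurrence on the given strand."""
    seq = dna.upper()
    return [(m.start() + 1, m.group(1)) for m in MYB_SITE.finditer(seq)]


# Hypothetical toy sequence containing two sites.
toy = "GGCAACGGTTTAACTGAA"
print(find_myb_sites(toy))  # [(3, 'CAACGG'), (11, 'TAACTG')]
```

    Sites on the opposite strand would need the reverse complement to be scanned in the same way.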

    Proteins where this domain is known:
    PF11_0241    PF13_0088    PFI1480w   


    PS50092 - TSP1 (Prosite link)

    Interpro entry IPR000884 : (Interpro link)

    Interpro description:

    Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5) are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.

    This repeat was first described in 1986 by Lawler and Hynes. It was found in the thrombospondin protein where it is repeated 3 times. Now a number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9) as well as extracellular matrix proteins like mindin, F-spondin, SCO-spondin and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis and apoptosis.

    The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat.

    Proteins where this domain is known:
    MAL8P1.45    PF13_0201    PFA0200w    PFC0210c    PFC0640w    PFF0800w    PFL0870w   


    PS50095 - PLAT (Prosite link)

    Interpro entry IPR001024 : (Interpro link)

    Interpro description:

    Lipoxygenases are a class of iron-containing dioxygenases which catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data is available for the following lipoxygenases:

    The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences; five of them are found clustered in a stretch of 40 amino acids. This region contains two of the three histidine iron ligands; the other histidines have been shown to be important for the activity of lipoxygenases.

    This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.

    Proteins where this domain is known:
    PF14_0067   


    PS50096 - IQ (Prosite link)

    Interpro entry IPR000048 : (Interpro link)

    Interpro description:

    Calmodulin (CaM) is recognized as a major calcium sensor and orchestrator of regulatory events through its interaction with a diverse group of cellular proteins. Three classes of recognition motifs exist for many of the known CaM binding proteins: the IQ motif as a consensus for Ca2+-independent binding, and two related motifs for Ca2+-dependent binding, termed 1-8-14 and 1-5-10 based on the position of conserved hydrophobic residues.

    The regulatory domain of scallop myosin is a three-chain protein complex that switches on this motor in response to Ca2+ binding. Side-chain interactions link the two light chains in tandem to adjacent segments of the heavy chain bearing the IQ-sequence motif. The Ca2+-binding site is a novel EF-hand motif on the essential light chain and is stabilized by linkages involving the heavy chain and both light chains, accounting for the requirement of all three chains for Ca2+ binding and regulation in the intact myosin molecule.

    Proteins where this domain is known:
    PF11_0416    PF11_0540    PF14_0224    PFL0975w   


    PS50102 - RRM (Prosite link)

    Interpro entry IPR000504 : RNA recognition motif, RNP-1 (Interpro link)

    Interpro description:

    Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs. The largest group of single strand RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family that contains an eight amino acid RNP-1 consensus sequence. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). The RRM in heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. The motif also appears in a few single stranded DNA binding proteins.

    The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases. The RRM is reviewed in a number of publications.

    Proteins where this domain is known:
    MAL13P1.120    MAL13P1.303    MAL13P1.338    MAL13P1.35    MAL7P1.126    MAL7P1.157a    MAL8P1.40    MAL8P1.83    PF07_0066    PF08_0086    PF10_0028    PF10_0047    PF10_0068    PF10_0194    PF10_0214    PF10_0217    PF10_0235    PF11_0083    PF11_0111    PF11_0200    PF11_0205    PF11_0279    PF11_0330    PF11_0402    PF13_0058    PF13_0122    PF13_0147    PF13_0165    PF13_0278    PF13_0315    PF13_0318    PF14_0028    PF14_0056    PF14_0057    PF14_0096    PF14_0194    PF14_0433    PF14_0513    PF14_0656    PFB0255w    PFC0865w    PFD0700c    PFD0750w    PFD0775c    PFE0160c    PFE0750c    PFE0865c    PFE0885w    PFF0150c    PFF0300w    PFF0320c    PFF0505c    PFF0760w    PFF1425w    PFI0820c    PFI1025w    PFI1175c    PFI1435w    PFI1600w    PFI1695c    PFL0375w    PFL0830w    PFL1170w    PFL1200c    PFL1705w    PFL1745c    PFL1990c    PFL2310w   


    PS50105 - SAM_DOMAIN (Prosite link)

    Interpro entry IPR001660 : (Interpro link)

    Interpro description:

    The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures, and also to bind various non-SAM domain-containing proteins, although with low affinity. SAM domains also appear to possess the ability to bind RNA. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.

    Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct intermonomer binding surfaces suggests that SAM could form extended polymeric structures.

    Proteins where this domain is known:
    PF11_0079    PF13_0258   


    PS50106 - PDZ (Prosite link)

    Interpro entry IPR001478 : PDZ/DHR/GLGF (Interpro link)

    Interpro description:

    PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.

    PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.

    Proteins where this domain is known:
    MAL8P1.98    PFC0330w   


    PS50115 - ARFGAP (Prosite link)

    Interpro entry IPR001164 : Arf GTPase activating protein (Interpro link)

    Interpro description:

    This entry describes a family of small GTPase activating proteins, for example ARF1-directed GTPase-activating protein, the cycle control GTPase activating protein (GAP) GCS1 which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in the recruitment of coat proteins, which are required for budding and fission of membranes. Before fusion with an acceptor compartment, the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-Cys-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.
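
    The zinc finger motif quoted above can be expressed as a regular expression for a quick first-pass scan of a protein sequence. The sketch below is illustrative only: it assumes the four-cysteine reading C-x(2)-C-x(16,17)-C-x(2)-C of the motif (in keeping with the C4-type classification), the function name and the toy sequence are hypothetical, and a plain pattern match is far less sensitive than the PS50115 profile itself.

```python
import re

# Assumed four-cysteine reading of the motif: C-x(2)-C-x(16,17)-C-x(2)-C.
# This is a simple pattern, not the PROSITE profile used for the annotation.
ARFGAP_ZNF = re.compile(r"C.{2}C.{16,17}C.{2}C")


def find_arfgap_like_fingers(sequence):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in ARFGAP_ZNF.finditer(sequence.upper())]


# Synthetic toy sequence (not a real protein): two Cys pairs separated by 16 residues.
toy = "MK" + "CAAC" + "G" * 16 + "CAAC" + "LE"
hits = find_arfgap_like_fingers(toy)
print(hits)
```

    A hit only indicates that the cysteine spacing is compatible with the motif; it says nothing about whether the surrounding sequence folds as an ARFGAP domain.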

    The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the structures of other GAP proteins, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role of the ARFGAP domain in GTP hydrolysis.

    The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.

    Proteins where this domain is known:
    PF08_0120    PFE1305c    PFL2140c   


    PS50118 - HMG_BOX_2 (Prosite link)

    Interpro entry IPR000910 : High mobility group, HMG1/HMG2 (Interpro link)

    Interpro description:

    High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.

    The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes: the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.

    Proteins where this domain is known:
    MAL13P1.290    MAL8P1.72    PFL0145c   


    PS50119 - ZF_BBOX (Prosite link)

    Interpro entry IPR000315 : Zinc finger, B-box (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitinylation. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins.

    The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a beta/beta/alpha cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.37    PF14_0383    PFC0345w    PFE0895c   


    PS50125 - GUANYLATE_CYCLASE_2 (Prosite link)

    Interpro entry IPR001054 : Adenylyl cyclase class-3/4/guanylyl cyclase (Interpro link)

    Interpro description:

    Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.

    In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.

    Proteins where this domain is known:
    MAL13P1.301    MAL8P1.150    PF11_0395    PF14_0043   


    PS50126 - S1 (Prosite link)

    Interpro entry IPR003029 : S1, RNA binding (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site.

    The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein.

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    MAL8P1.101    PF07_0117    PF10_0294    PFD0515w    PFE0830c   


    PS50127 - UBIQUITIN_CONJUGAT_2 (Prosite link)

    Interpro entry IPR000608 : Ubiquitin-conjugating enzyme, E2 (Interpro link)

    Interpro description:

    The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.

    There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
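
    The four-class scheme described above depends only on which terminal extensions flank the UBC catalytic core, so it can be written as a small lookup. The sketch below is illustrative: the function name and its boolean parameters are hypothetical, and deciding whether a given protein actually has an extension would require a separate domain-boundary analysis that is not shown.

```python
def e2_class(has_n_extension, has_c_extension):
    """Map the extensions flanking the UBC core to E2 class I, II, III or IV."""
    if has_n_extension and has_c_extension:
        return "IV"   # UBC plus both N- and C-terminal extensions
    if has_n_extension:
        return "III"  # UBC plus an N-terminal extension
    if has_c_extension:
        return "II"   # UBC plus a C-terminal extension
    return "I"        # catalytic core domain only


for n_ext in (False, True):
    for c_ext in (False, True):
        print(n_ext, c_ext, e2_class(n_ext, c_ext))
```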

    Proteins where this domain is known:
    MAL13P1.227    PF08_0085    PF10_0330    PF13_0301    PF14_0128    PFC0255c    PFC0855w    PFE1350c    PFF0305c    PFI0740c    PFI1030c    PFL0190w    PFL2100w    PFL2175w   


    PS50128 - SURP (Prosite link)

    Interpro entry IPR000061 : SWAP/Surp (Interpro link)

    Interpro description:

    SWAP is derived from the Suppressor-of-White-APricot splicing regulator from Drosophila melanogaster. The domain is found in regulators responsible for pervasive, nonsex-specific alternative pre-mRNA splicing characteristics and has been found in splicing regulatory proteins. These ancient, conserved SWAP proteins share a colinearly arrayed series of novel sequence motifs.

    Proteins where this domain is known:
    PF14_0028    PF14_0713    PFF1165c   


    PS50135 - ZF_ZZ_2 (Prosite link)

    Interpro entry IPR000433 : Zinc finger, ZZ-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents ZZ-type zinc finger domains, named because of their ability to bind two zinc ions. These domains contain 4-6 Cys residues that participate in zinc binding (plus additional Ser/His residues), including a Cys-X2-Cys motif found in other zinc finger domains. These zinc fingers are thought to be involved in protein-protein interactions. The structure of the ZZ domain shows that it belongs to the family of cross-brace zinc finger motifs that include the PHD, RING, and FYVE domains. ZZ-type zinc finger domains are found in:

    Single copies of the ZZ zinc finger occur in the transcriptional adaptor/coactivator proteins P300, in cAMP response element-binding protein (CREB)-binding protein (CBP) and ADA2. CBP provides several binding sites for transcriptional coactivators. The site of interaction with the tumour suppressor protein p53 and the oncoprotein E1A with CBP/P300 is a Cys-rich region that incorporates two zinc-binding motifs: ZZ-type and TAZ2-type. The ZZ-type zinc finger of CBP contains two twisted anti-parallel beta-sheets and a short alpha-helix, and binds two zinc ions. One zinc ion is coordinated by four cysteine residues via two Cys-X2-Cys motifs, and the second zinc ion via a third Cys-X-Cys motif and a His-X-His motif. The first zinc cluster is strictly conserved, whereas the second zinc cluster displays variability in the position of the two His residues.

    In Arabidopsis thaliana (Mouse-ear cress), the hypersensitive to red and blue 1 (Hrb1) protein, which regulates both red and blue light responses, contains a ZZ-type zinc finger domain.

    ZZ-type zinc finger domains have also been identified in the testis-specific E3 ubiquitin ligase MEX that promotes death receptor-induced apoptosis. MEX has four putative zinc finger domains: one ZZ-type, one SWIM-type and two RING-type. The region containing the ZZ-type and RING-type zinc fingers is required for interaction with UbcH5a and MEX self-association, whereas the SWIM domain was critical for MEX ubiquitination.

    In addition, the Cys-rich domains of dystrophin, utrophin and an 87kDa post-synaptic protein contain a ZZ-type zinc finger with high sequence identity to P300/CBP ZZ-type zinc fingers. In dystrophin and utrophin, the ZZ-type zinc finger lies between a WW domain (flanked by an EF hand) and the C-terminal coiled-coil domain. Dystrophin is thought to act as a link between the actin cytoskeleton and the extracellular matrix, and perturbations of the dystrophin-associated complex, for example, between dystrophin and the transmembrane glycoprotein beta-dystroglycan, may lead to muscular dystrophy. Dystrophin and its autosomal homologue utrophin interact with beta-dystroglycan via their C-terminal regions, which comprise a WW domain, an EF hand domain and a ZZ-type zinc finger domain. The WW domain is the primary site of interaction between dystrophin or utrophin and dystroglycan, while the EF hand and ZZ-type zinc finger domains stabilise and strengthen this interaction.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0026    PF10_0143   


    PS50138 - BRCA2_REPEAT (Prosite link)

    Interpro entry IPR018231 : (Interpro link)

    Interpro description:

    The breast cancer type 2 susceptibility protein has a number of 39 amino acid repeats that are critical for binding to RAD51 (a key protein in DNA recombinational repair) and resistance to methyl methanesulphonate treatment. BRCA2 is a breast tumour suppressor with a potential function in the cellular response to DNA damage. At the cellular level, expression is regulated in a cell-cycle dependent manner and peak expression of BRCA2 mRNA is found in S phase, suggesting BRCA2 may participate in regulating cell proliferation. There are eight repeats in BRCA2 designated as BRC1 to BRC8. BRC1, BRC2, BRC3, BRC4, BRC7, and BRC8 are highly conserved and bind to Rad51, whereas BRC5 and BRC6 are less well conserved and do not bind to Rad51. It has been suggested that BRCA2 plays a role in positioning Rad51 at the site of DNA repair or in removing Rad51 from DNA once repair has been completed.

    Proteins where this domain is known:
    PF13_0155   


    PS50144 - MATH (Prosite link)

    Interpro entry IPR002083 : (Interpro link)

    Interpro description:

    Although apparently functionally unrelated, intracellular TRAFs and extracellular meprins share a conserved region of about 180 residues, the meprin and TRAF homology (MATH) domain. Meprins are mammalian tissue-specific metalloendopeptidases of the astacin family implicated in developmental, normal and pathological processes by hydrolysing a variety of proteins. Various growth factors, cytokines, and extracellular matrix proteins are substrates for meprins. They are composed of five structural domains: an N-terminal endopeptidase domain, a MAM domain, a MATH domain, an EGF-like domain and a C-terminal transmembrane region. Meprin A and B form membrane-bound homotetramers, whereas homooligomers of meprin A are secreted. A proteolytic site adjacent to the MATH domain, only present in meprin A, allows the release of the protein from the membrane.

    TRAF proteins were first isolated by their ability to interact with TNF receptors. They promote cell survival by the activation of downstream protein kinases and, finally, transcription factors of the NF-kB and AP-1 family. The TRAF proteins are composed of 3 structural domains: a RING finger in the N-terminal part of the protein, one to seven TRAF zinc fingers in the middle and the MATH domain in the C-terminal part. The MATH domain is necessary and sufficient for self-association and receptor interaction. From the structural analysis, two consensus sequences recognized by the TRAF domain have been defined: a major one, [PSAT]x[QE]E, and a minor one, PxQxxD.
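
    The two consensus sequences given above translate directly into regular expressions, with x read as any residue. The sketch below is illustrative only: the function name and the toy sequence are hypothetical, and such short motifs will match many sequences by chance, so a hit is a candidate to follow up rather than evidence of TRAF binding.

```python
import re

# Direct transcription of the consensus sequences, with 'x' as any residue.
MAJOR = re.compile(r"[PSAT].[QE]E")  # major consensus [PSAT]x[QE]E
MINOR = re.compile(r"P.Q..D")        # minor consensus PxQxxD


def traf_motif_hits(sequence):
    """Return start positions (0-based) of non-overlapping hits for each consensus."""
    seq = sequence.upper()
    return {
        "major": [m.start() for m in MAJOR.finditer(seq)],
        "minor": [m.start() for m in MINOR.finditer(seq)],
    }


# Synthetic toy sequence (not a real receptor tail): PVQE at index 2, PAQGGD at index 9.
toy = "GGPVQETGGPAQGGDGG"
print(traf_motif_hits(toy))
```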

    The structure of the TRAF2 protein reveals a trimeric self-association of the MATH domain. The domain forms a new, eight-stranded antiparallel beta sandwich structure. A coiled-coil region adjacent to the MATH domain is also important for the trimerisation. The oligomerisation is essential for establishing appropriate connections to form signalling complexes with TNF receptor-1. The ligand binding surface of TRAF proteins is located in beta-strands 6 and 7.

    Proteins where this domain is known:
    PFE0570w   


    PS50156 - SSD (Prosite link)

    Interpro entry IPR000731 : Sterol-sensing 5TM box (Interpro link)

    Interpro description:

    The sterol-sensing domain (SSD) consists of approximately 180 amino acids organised into a cluster of five consecutive membrane-spanning domains and is found in proteins which have key roles in different aspects of cholesterol homeostasis or cholesterol-linked signalling such as sterol-regulated movement or the trafficking of specific cargoes. Examples of proteins containing SSDs include the Hedgehog signalling protein (Patched protein) from Drosophila; 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGCR), which is involved in the control of cholesterol biosynthesis; SREBP cleavage-activating protein (SCAP); the Niemann-Pick type C (NPC1) protein; and a number of bacterial drug resistance proteins.

    The role of the SSD is still open to debate. The domain may either bind directly to sterols, sterol-modified proteins or proteins that change conformation in response to sterol levels, or trigger an intramolecular response to sterols.

    Proteins where this domain is known:
    PFA0375c   


    PS50157 - ZINC_FINGER_C2H2_2 (Prosite link)

    Interpro entry IPR007087 : Zinc finger, C2H2-type (Interpro link)

    Interpro description:

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the classical C2H2 type zinc finger domain.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0118    PF10_0058    PF14_0657    PFC0690c    PFL0465c   


    PS50158 - ZF_CCHC (Prosite link)

    Interpro entry IPR001878 : Zinc finger, CCHC-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:

    where X can be any amino acid, and the number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF14_0139   


    PS50159 - RIBOSOMAL_S13_2 (Prosite link)

    Interpro entry IPR001892 : Ribosomal protein S13 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. It is a basic protein of 115 to 177 amino-acid residues. This family of ribosomal proteins is present in prokaryotes and eukaryotes.

    Proteins where this domain is known:
    PF11_0272   


    PS50160 - DNA_LIGASE_A3 (Prosite link)

    Interpro entry IPR012310 : ATP dependent DNA ligase, central (Interpro link)

    Interpro description:

    This domain belongs to a more diverse superfamily, including the catalytic domain of the mRNA capping enzyme and NAD-dependent DNA ligase.

    Proteins where this domain is known:
    MAL13P1.22   


    PS50162 - RECA_2 (Prosite link)

    Interpro entry IPR001553 : RecA bacterial DNA recombination (Interpro link)

    Interpro description:

    The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage.

    RecA is a protein of about 350 amino-acid residues. Its sequence is very well conserved among eubacterial species. It is also found in the chloroplast of plants. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament visualised by X-ray crystallography, beta-strand 3, the loop C-terminal to beta-strand 2, and alpha-helix D of the core domain form one surface that packs against alpha-helix A and beta-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between beta-strand 1 and alpha-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another nucleotide binding motif, the Walker B box is found at beta-strand 4 in the RecA structure. The Walker B box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate). Nucleotide specificity and additional ATP binding interactions are contributed by the amino acid residues at beta-strand 2 and the loop C-terminal to that strand, all of which are greater than 90% conserved among bacterial RecA proteins.
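
    The Walker A and Walker B descriptions above translate directly into regular expressions. The following is a minimal Python sketch and is not part of the Interpro entry: the hydrophobic residue alphabet, the helper name find_motifs and the second example sequence are assumptions made purely for illustration.

```python
import re

# Walker A (P-loop) consensus as given in the text: (G/A)XXXXGK(T/S)
WALKER_A = re.compile(r"[GA].{4}GK[TS]")

# Walker B as described in the text: four hydrophobic residues, then an acidic one.
# The hydrophobic alphabet below is an assumption, not taken from the entry.
HYDROPHOBIC = "AVILMFWC"
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}[DE]")


def find_motifs(pattern, sequence):
    """Return (start, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# The Escherichia coli P-loop sequence quoted in the description.
print(find_motifs(WALKER_A, "GPESSGKT"))

# A made-up (hypothetical) fragment used only to exercise the Walker B pattern.
print(find_motifs(WALKER_B, "KKVIVLDSS"))
```

    A pattern this short will also match by chance in unrelated proteins, so a hit is only a hint that should be checked against the structural context described above.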

    Proteins where this domain is known:
    MAL8P1.76    PF11_0087   


    PS50163 - RECA_3 (Prosite link)

    Interpro entry IPR001553 : RecA bacterial DNA recombination (Interpro link)

    Interpro description:

    The recA gene product is a multifunctional enzyme that plays a role in homologous recombination, DNA repair and induction of the SOS response. In homologous recombination, the protein functions as a DNA-dependent ATPase, promoting synapsis, heteroduplex formation and strand exchange between homologous DNAs. RecA also acts as a protease cofactor that promotes autodigestion of the lexA product and phage repressors. The proteolytic inactivation of the lexA repressor by an activated form of recA may cause a derepression of the 20 or so genes involved in the SOS response, which regulates DNA repair, induced mutagenesis, delayed cell division and prophage induction in response to DNA damage.

    RecA is a protein of about 350 amino-acid residues. Its sequence is very well conserved among eubacterial species. It is also found in the chloroplast of plants. RecA-like proteins are found in archaea and diverse eukaryotic organisms, like fission yeast, mouse or human. In the filament visualised by X-ray crystallography, beta-strand 3, the loop C-terminal to beta-strand 2, and alpha-helix D of the core domain form one surface that packs against alpha-helix A and beta-strand 0 (the N-terminal domain) of an adjacent monomer during polymerisation. The core ATP-binding site domain is well conserved, with 14 invariant residues. It contains the nucleotide binding loop between beta-strand 1 and alpha-helix C. The Escherichia coli sequence GPESSGKT matches the consensus sequence of amino acids (G/A)XXXXGK(T/S) for the Walker A box (also referred to as the P-loop) found in a number of nucleoside triphosphate (NTP)-binding proteins. Another nucleotide binding motif, the Walker B box is found at beta-strand 4 in the RecA structure. The Walker B box is characterised by four hydrophobic amino acids followed by an acidic residue (usually aspartate). Nucleotide specificity and additional ATP binding interactions are contributed by the amino acid residues at beta-strand 2 and the loop C-terminal to that strand, all of which are greater than 90% conserved among bacterial RecA proteins.

    Proteins where this domain is known:
    MAL8P1.76    PF11_0087   


    PS50166 - IMPORTIN_B_NT (Prosite link)

    Interpro entry IPR001494 : Importin-beta, N-terminal (Interpro link)

    Interpro description:

    The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.

    Members of the importin-beta (karyopherin-beta) family can bind and transport cargo by themselves, or can form heterodimers with importin-alpha. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Importin-beta is a helicoidal molecule constructed from 19 HEAT repeats. Many nuclear pore proteins contain FG sequence repeats that can bind to HEAT repeats within importins, which is important for importin-beta mediated transport.

    Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. As a result, the N-terminal auto-inhibitory region on importin-alpha is free to loop back and bind to the major NLS-binding site, causing the cargo to be released. There are additional release factors as well.

    More information about these proteins can be found at Protein of the Month: Importins.

    Proteins where this domain is known:
    MAL7P1.202    PF08_0069    PF14_0304    PFC0135c    PFF1345w   


    PS50171 - ZF_MATRIN (Prosite link)

    Interpro entry IPR000690 : Zinc finger, C2H2-type matrin (Interpro link)

    Interpro description:

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    A specific C2H2 Zn-finger is conserved in matrin and several RNA-binding proteins. The Zn-finger follows the general pattern C-x2-C-x(12,16)-H-x5-H, and is different from the 'classical' DNA-binding C2H2 Zn-finger.
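
    The matrin-type pattern quoted above can be turned into a regular expression mechanically. The following is a minimal Python sketch and is not part of the Interpro entry: the helper name pattern_to_regex and the synthetic test sequences are hypothetical, and the parser only handles the notation used in this description (single residues, xN and x(min,max)).

```python
import re


def pattern_to_regex(pattern):
    """Convert a pattern such as 'C-x2-C-x(12,16)-H-x5-H' into a regex string."""
    parts = []
    for element in pattern.split("-"):
        # Accept 'x', 'xN', 'x(N)' and 'x(min,max)' as wildcard elements.
        m = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\)|(\d+))?", element)
        if m is None:
            parts.append(re.escape(element))  # a literal residue such as C or H
        elif m.group(3):
            parts.append(".{%s}" % m.group(3))
        elif m.group(2):
            parts.append(".{%s,%s}" % (m.group(1), m.group(2)))
        elif m.group(1):
            parts.append(".{%s}" % m.group(1))
        else:
            parts.append(".")
    return "".join(parts)


MATRIN = pattern_to_regex("C-x2-C-x(12,16)-H-x5-H")
print(MATRIN)

# Synthetic sequences (not real proteins): a 12-residue spacer fits, an 11-residue one does not.
fits = "C" + "AA" + "C" + "A" * 12 + "H" + "A" * 5 + "H"
too_short = "C" + "AA" + "C" + "A" * 11 + "H" + "A" * 5 + "H"
print(bool(re.fullmatch(MATRIN, fits)), bool(re.fullmatch(MATRIN, too_short)))
```

    The variable spacer x(12,16) is what the description gives as distinguishing this finger from the 'classical' DNA-binding C2H2 pattern, so it is the part of the expression most worth checking when a candidate match is found.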

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0084    PF14_0026    PFF0970w    PFI1215w   


    PS50172 - BRCT (Prosite link)

    Interpro entry IPR001357 : BRCT (Interpro link)

    Interpro description:

    The BRCT domain (after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example as found in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.

    A chitin biosynthesis protein from yeast also seems to belong to this group.

    Proteins where this domain is known:
    MAL13P1.275    PF11_0090    PF13_0126    PFB0895c    PFI0510c   


    PS50173 - UMUC (Prosite link)

    Interpro entry IPR017963 : DNA-repair protein, UmuC-like, N-terminal (Interpro link)

    Interpro description:

    This entry represents the N-terminal domain of UmuC-like DNA repair proteins. In Escherichia coli, UV and many chemicals appear to cause mutagenesis by a process of translesion synthesis that requires DNA polymerase III and the SOS-regulated proteins UmuD, UmuC and RecA. This machinery allows replication to continue through DNA lesions, and therefore avoids a lethal interruption of DNA replication after DNA damage. UmuC is a well conserved protein in prokaryotes, with a homologue in yeast species.

    Proteins currently known to belong to this family are listed below:

    Proteins where this domain is known:
    PFI0510c   


    PS50174 - G_PATCH (Prosite link)

    Interpro entry IPR000467 : D111/G-patch (Interpro link)

    Interpro description:
    The D111/G-patch domain is a short conserved region of about 40 amino acids which occurs in a number of putative RNA-binding proteins, including tumor suppressor and DNA-damage-repair proteins, suggesting that this domain may have an RNA binding function. This domain has seven highly conserved glycines. A multiple alignment of a small subset of D111/G-patch domains is shown in Fig. 2b of.

    Proteins where this domain is known:
    PF14_0513   


    PS50176 - ARM_REPEAT (Prosite link)

    Interpro entry IPR000225 : (Interpro link)

    Interpro description:

    The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions.

    The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.

    Proteins where this domain is known:
    MAL13P1.308    PF08_0087   


    PS50177 - NTF2_DOMAIN (Prosite link)

    Interpro entry IPR018222 : (Interpro link)

    Interpro description:

    Ran is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Import receptors bind their cargos in the cytoplasm where the concentration of RanGTP is low and release their cargos in the nucleus where the concentration of RanGTP is high. Export receptors respond to RanGTP in the opposite manner.

    Nuclear transport factor 2 (NTF2) is a homodimer of approximately 14kDa subunits which stimulates efficient nuclear import of a cargo protein. NTF2 binds to both RanGDP and FxFG repeat-containing nucleoporins. NTF2 binds to RanGDP sufficiently strongly for the complex to remain intact during transport through NPCs, but the interaction between NTF2 and FxFG nucleoporins is much more transient, which would enable NTF2 to move through the NPC by hopping from one repeat to another.

    NTF2 folds into a cone with a deep hydrophobic cavity, the opening of which is surrounded by several negatively charged residues. RanGDP binds to NTF2 by inserting a conserved phenylalanine residue into the hydrophobic pocket of NTF2 and making electrostatic interactions with the conserved negatively charged residues that surround the cavity.

    This entry contains predominantly eukaryotic proteins. The following proteins contain a region similar to NTF2:

    Proteins where this domain is known:
    PF14_0122    PF14_0228   


    PS50178 - ZF_FYVE (Prosite link)

    Interpro entry IPR017455 : (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two Zn2+ ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF13_0055    PF14_0574   


    PS50179 - VHS (Prosite link)

    Interpro entry IPR002014 : VHS (Interpro link)

    Interpro description:

    The VHS domain is a ~140 residues long domain, whose name is derived from its occurrence in VPS-27, Hrs and STAM. Based on regions surrounding the domain, VHS-proteins can be divided into 4 groups:

    The VHS domain is always found at the N-terminus of proteins, suggesting that such topology is important for function. The domain is considered to have a general membrane targeting/cargo recognition role in vesicular trafficking.

    Resolution of the crystal structure of the VHS domain of Drosophila Hrs and human Tom1 revealed that it consists of eight helices arranged in a double-layer superhelix. The existence of conserved patches of residues on the domain surface suggests that VHS domains may be involved in protein-protein recognition and docking. Overall, sequence similarity is low (approx 25%) amongst domain family members.

    Proteins where this domain is known:
    PF13_0041   


    PS50180 - GAE (Prosite link)

    Interpro entry IPR008153 : Clathrin adaptor, gamma-adaptin, appendage (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of gamma1-adaptin from AP1 clathrin adaptor complex, and the homologous C-terminal GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 8 strands in 2 beta-sheets in a Greek key topology. This is a similar fold to that found in alpha- and beta-adaptins, but there is little sequence identity between them. The GAE domain is involved in the recruitment of accessory proteins, such as gamma-synergin, Rabaptin-5, Eps15 and cyclin G-associated kinase, which modulate the functions of GAE domain-containing proteins in membrane trafficking events. The binding site in GAE for accessory proteins is located in a shallow hydrophobic trough surrounded by charged (mainly basic) residues.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF14_0529   


    PS50186 - DEP (Prosite link)

    Interpro entry IPR000591 : Pleckstrin/G-protein, interacting region (Interpro link)

    Interpro description:

    This is a domain of unknown function present in signalling proteins including dishevelled, Egl-10, and pleckstrin proteins. Segment polarity dishevelled protein is required to establish coherent arrays of polarized cells and segments in embryos, and plays a role in wingless signalling. Egl-10 regulates G-protein signalling in the central nervous system. Mammalian regulators of G-protein signalling also contain these domains, and regulate signal transduction by increasing the GTPase activity of G-protein alpha subunits, thereby driving them into their inactive GDP-bound form.

    Proteins where this domain is known:
    PF14_0361   


    PS50188 - B302_SPRY (Prosite link)

    Interpro entry IPR001870 : (Interpro link)

    Interpro description:

    The B30.2-like domain is a conserved domain of 160-170 amino acids which is found in nuclear and cytoplasmic proteins, as well as transmembrane and secreted proteins. It was named after the B30-2 exon which maps within the Homo sapiens (Human) class I histocompatibility complex region and codes for a 166-amino-acid peptide similar to the C-terminal domain of human Sjoegren's syndrome nuclear antigen A/Ro (SS-A/Ro), ret finger protein (RFP), Xenopus laevis nuclear factor 7 (XNF7), and Bos taurus (Bovine) butyrophilin. The B30.2-like domain is found associated with different N-terminal domains: immunoglobulin domain in the case of butyrophilin, zinc-binding B-box domain in the case of RFP and SS-A/Ro and leucine zipper in the case of enterophilin. The function of the B30.2-like domain is not known, but the cytoplasmic B30.2-like domain of butyrophilin has been shown to interact with xanthine oxidase.

    Proteins where this domain is known:
    PF08_0021    PF10_0140    PFE1085w   


    PS50190 - SEC7 (Prosite link)

    Interpro entry IPR000904 : SEC7-like (Interpro link)

    Interpro description:
    The SEC7 domain was named after the first protein found to contain such a region. It has been shown to be linked with guanine nucleotide exchange function. The 3D structure of the domain displays several alpha-helices. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian factors.

    Proteins where this domain is known:
    PF14_0407   


    PS50191 - CRAL_TRIO (Prosite link)

    Interpro entry IPR001251 : (Interpro link)

    Interpro description:
    This entry defines the C-terminal of various retinaldehyde/retinal-binding proteins that may be functional components of the visual cycle. Cellular retinaldehyde-binding protein (CRALBP) carries 11-cis-retinol or 11-cis-retinaldehyde as endogenous ligands and may function as a substrate carrier protein that modulates interaction of these retinoids with visual cycle enzymes. The multidomain protein Trio binds the LAR transmembrane tyrosine phosphatase, contains a protein kinase domain, and has separate rac-specific and rho-specific guanine nucleotide exchange factor domains. Trio is a multifunctional protein that integrates and amplifies signals involved in coordinating actin remodeling, which is necessary for cell migration and growth.

    Other members of the family are transfer proteins that include a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.

    Proteins where this domain is known:
    MAL7P1.83    PF11_0287    PFF1280w    PFF1450w    PFI1015w   


    PS50192 - T_SNARE (Prosite link)

    Interpro entry IPR000727 : (Interpro link)

    Interpro description:

    The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), which are associated with the fusing membranes. Target SNAREs (t-SNAREs) are localised on the target membrane and belong to two different families, the syntaxin-like family and the SNAP-25 like family. One member of each family, together with a v-SNARE localised on the vesicular membrane, is required for fusion.

    The Syntaxins are type-I transmembrane proteins that contain several regions with coiled-coil propensity in their cytosolic part, the SNARE motif. SNAP-25 is a protein consisting of two coiled-coil regions, which is associated with the membrane by lipid anchors. SNARE motifs assemble into parallel four-helix bundles stabilised by the burial of their hydrophobic helix faces in the bundle core. Monomeric SNARE motifs are disordered, so this assembly reaction is accompanied by a dramatic increase in alpha-helical secondary structure. The parallel arrangement of SNARE motifs within complexes brings the transmembrane anchors, and the two membranes, into close proximity. Recently, it was shown that the two coiled-coil regions of SNAP-25 and one of the coiled-coil regions of the syntaxins are related. This domain is found in both Syntaxin and SNAP-25 families as well as in other proteins.

    Proteins where this domain is known:
    MAL13P1.113    MAL13P1.169    MAL13P1.365    PF11_0052    PF14_0300    PF14_0464    PF14_0500    PF14_0535    PFB0480w    PFE1505w    PFL0505c    PFL2070w   


    PS50194 - FILAMIN_REPEAT (Prosite link)

    Interpro entry IPR017868 : (Interpro link)

    Interpro description:

    The many different actin cross-linking proteins share a common architecture, consisting of a globular actin-binding domain and an extended rod. Whereas their actin-binding domains consist of two calponin homology domains, their rods fall into three families.

    The rod domain of the family including the Dictyostelium discoideum (Slime mould) gelation factor (ABP120) and human filamin (ABP280) is constructed from tandem repeats of a 100-residue motif that is glycine and proline rich. The gelation factor's rod contains 6 copies of the repeat, whereas filamin has a rod constructed from 24 repeats. The resolution of the 3D structure of rod repeats from the gelation factor has shown that they consist of a beta-sandwich, formed by two beta-sheets arranged in an immunoglobulin-like fold. Because conserved residues that form the core of the repeats are preserved in filamin, the repeat structure should be common to the members of the gelation factor/filamin family.

    The head to tail homodimerisation is crucial to the function of the ABP120 and ABP280 proteins. This interaction involves a small portion at the distal end of the rod domains. For the gelation factor it has been shown that the carboxy-terminal repeat 6 dimerises through a double edge-to-edge extension of the beta-sheet and that repeat 5 contributes to dimerisation to some extent.

    Proteins where this domain is known:
    PF11_0158    PFF0685c   


    PS50195 - PX (Prosite link)

    Interpro entry IPR001683 : Phox-like (Interpro link)

    Interpro description:

    The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47phox binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.

    The PX domain is conserved from yeast to human. Although multiple alignments of representative PX domain sequences show relatively little sequence conservation, their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX domain is also a protein-protein interaction domain.

    Proteins where this domain is known:
    MAL7P1.108    PF07_0017   


    PS50196 - RANBD1 (Prosite link)

    Interpro entry IPR000156 : Ran Binding Protein 1 (Interpro link)

    Interpro description:

    Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran Binding Protein 1 (RanBP1) has guanine nucleotide dissociation inhibitory activity, specific for the GTP form of Ran, and also functions to stimulate Ran GTPase activating protein (GAP)-mediated GTP hydrolysis by Ran. RanBP1 contributes to keeping the gradient of RanGTP across the nuclear envelope high (GDI activity) or the cytoplasmic levels of RanGTP low (GAP cofactor).

    All RanBP1 proteins contain an approx 150 amino acid residue Ran binding domain. RanBP1 binds directly to RanGTP with high affinity. There are four sites of contact between Ran and the Ran binding domain. One of these involves binding of the C-terminal segment of Ran to a groove on the Ran binding domain that is analogous to the surface utilised in the EVH1-peptide interaction. Nup358 contains four Ran binding domains. The structure of the first of these is known.

    Proteins where this domain is known:
    PFD0950w   


    PS50197 - BEACH (Prosite link)

    Interpro entry IPR000409 : (Interpro link)

    Interpro description:

    The "beige" mouse is established as an animal model of Chediak-Higashi Syndrome (CHS). The BEACH domain was described in the BEIGE protein (D1035670) and in the highly homologous CHS protein. It is also found in distantly related proteins, for example the factor associated with neutral sphingomyelinase activation.

    The BEACH domain is usually followed by a series of WD repeats. The function of the BEACH domain is unknown.

    Proteins where this domain is known:
    PF11_0252   


    PS50199 - ZF_RANBP2_2 (Prosite link)

    Interpro entry IPR001876 : Zinc finger, RanBP2-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the zinc finger domain found in RanBP2 proteins. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran binding protein 2 (RanBP2) is a 358-kDa nucleoporin located on the cytoplasmic side of the nuclear pore complex which plays a role in nuclear protein import. RanBP2 contains multiple zinc fingers which mediate binding to RanGDP.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF13_0099    PF13_0278    PFD0405c   


    PS50202 - MSP (Prosite link)

    Interpro entry IPR000535 : Major sperm protein (Interpro link)

    Interpro description:

    Major sperm proteins (MSP) are central components in molecular interactions underlying sperm motility in Caenorhabditis elegans, whose sperm employ an amoeba-like crawling motion using an MSP-containing lamellipod, rather than the flagellar-based swimming motion associated with other sperm. These proteins oligomerise to form an extensive filament system that extends from sperm villipoda, along the leading edge of the pseudopod. About 30 MSP isoforms may exist in C. elegans.

    MSPs form a fibrous network, whereby MSP dimers form helical subfilaments that coil around one another to produce filaments, which in turn form supercoils to produce bundles. The crystal structure of MSP from C. elegans reveals an immunoglobulin (Ig)-like seven-stranded beta sandwich fold.

    Proteins where this domain is known:
    PF14_0377   


    PS50203 - CALPAIN_CAT (Prosite link)

    Interpro entry IPR001300 : Peptidase C2, calpain (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.

    All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:

    1. A 19-amino acid NH2-terminal sequence;
    2. Active site domain IIa;
    3. Active site domain IIb.

      Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.

    4. Domain III;
    5. An 18-amino acid extended sequence linking domain III to domain IV;
    6. Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft allowing for the proper formation of the catalytic triad.

    Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms' cells is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.

    Proteins where this domain is known:
    MAL13P1.310   


    PS50206 - RHODANESE_3 (Prosite link)

    Interpro entry IPR001763 : (Interpro link)

    Interpro description:

    Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.

    Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.

    Proteins where this domain is known:
    PF13_0027    PFL0320w   


    PS50213 - FAS1 (Prosite link)

    Interpro entry IPR000782 : (Interpro link)

    Interpro description:

    The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria.

    The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI-anchored protein.

    FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.

    The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues. Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.

    Proteins where this domain is known:
    PF14_0446   


    PS50216 - ZF_DHHC (Prosite link)

    Interpro entry IPR001594 : Zinc finger, DHHC-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the DHHC-type zinc finger domain, which is also known as NEW1. The DHHC Zn-finger was first isolated in the Drosophila putative transcription factor DNZ1. The function of this domain is unknown, but it has been predicted to be involved in protein-protein or protein-DNA interactions.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.117    MAL13P1.126    MAL7P1.68    PF10_0273    PF11_0167    PF11_0217    PFB0140w    PFB0725c    PFC0160w    PFE1415w    PFF0485c    PFI1580c   


    PS50222 - EF_HAND_2 (Prosite link)

    Interpro entry IPR018249 : (Interpro link)

    Interpro description:
    Many calcium-binding proteins belong to the same evolutionary family and share a type of calcium-binding domain known as the EF-hand. This type of domain consists of a twelve-residue loop flanked on both sides by a twelve-residue alpha-helical domain. In an EF-hand loop the calcium ion is coordinated in a pentagonal bipyramidal configuration. The six residues involved in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y, -X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding Ca (bidentate ligand).
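
    The position labelling described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example loop string are hypothetical and are not taken from this listing or from any Interpro tool.

```python
# Coordinating positions in a 12-residue EF-hand loop (1-based), paired with
# the labels used in the description: X, Y, Z, -Y, -X, -Z.
COORDINATING_POSITIONS = {1: "X", 3: "Y", 5: "Z", 7: "-Y", 9: "-X", 12: "-Z"}


def ef_hand_ligands(loop: str) -> dict[str, str]:
    """Map each coordination label to the residue found at that loop position."""
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is expected to be 12 residues long")
    return {label: loop[pos - 1] for pos, label in COORDINATING_POSITIONS.items()}


def has_bidentate_ligand(loop: str) -> bool:
    """Check that position 12 (-Z) is Glu (E) or Asp (D)."""
    return ef_hand_ligands(loop)["-Z"] in {"E", "D"}


# Illustrative loop sequence (an assumption made for this example only).
example_loop = "DKDGDGTITTKE"
print(ef_hand_ligands(example_loop))
print(has_bidentate_ligand(example_loop))
```

    Such a check only tests the position-12 rule stated in the description; it is not a substitute for the PS50222 profile itself.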

    Proteins where this domain is known:
    MAL13P1.156    MAL7P1.10    MAL7P1.69    MAL8P1.79    PF07_0072    PF10_0177    PF10_0244    PF10_0271    PF10_0301    PF11_0066    PF11_0098    PF11_0239    PF11_0242    PF11_0389    PF13_0211    PF14_0181    PF14_0224    PF14_0323    PF14_0420    PF14_0443    PF14_0492    PF14_0607    PFA0345w    PFA0515w    PFB0815w    PFC0420w    PFD0692c    PFF0265c    PFF0520w    PFF1320c    PFL2225w   


    PS50228 - SUEL_LECTIN (Prosite link)

    Interpro entry IPR000922 : D-galactoside/L-rhamnose binding SUEL lectin (Interpro link)

    Interpro description:

    The D-galactoside binding lectin purified from sea urchin (Anthocidaris crassispina) eggs exists as a disulphide-linked homodimer of two subunits; the dimeric form is essential for hemagglutination activity. The sea urchin egg lectin (SUEL) forms a new class of lectins. Although SUEL was first isolated as a D-galactoside binding lectin, it was later shown that it binds to L-rhamnose preferentially. L-rhamnose and D-galactose share the same hydroxyl group orientation at C2 and C4 of the pyranose ring structure.

    A cysteine-rich domain homologous to the SUEL protein has also been identified in other proteins.

    Proteins where this domain is known:
    MAL13P1.154   


    PS50231 - RICIN_B_LECTIN (Prosite link)

    Interpro entry IPR000772 : (Interpro link)

    Interpro description:
    Ricin is a legume lectin from the seeds of the castor bean plant, Ricinus communis. The seeds are poisonous to people, animals and insects and just one milligram of ricin can kill an adult.

    Primary structure analysis has shown the presence of a similar domain in many carbohydrate-recognition proteins like plant and bacterial AB-toxins, glycosidases or proteases. This domain, known as the ricin B lectin domain, can be present in one or more copies and has been shown in some instances to bind simple sugars, such as galactose or lactose.

    The ricin B lectin domain is composed of three homologous subdomains of 40 amino acids (alpha, beta and gamma) and a linker peptide of around 15 residues (lambda). It has been proposed that the ricin B lectin domain arose by gene triplication from a primitive 40 residue galactoside-binding peptide. The most characteristic, though not completely conserved, sequence feature is the presence of a Q-W pattern. Consequently, the ricin B lectin domain has also been referred to as the (QxW)3 domain and the three homologous regions as the QxW repeats. A disulphide bond is also conserved in some of the QxW repeats.
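
    As a rough illustration of the Q-W pattern mentioned above, the following Python sketch scans a sequence for Q-x-W tripeptides with a regular expression. The function name and the test sequence are hypothetical, and a bare Q-x-W match is far weaker evidence than the PS50231 profile, since the description notes that the pattern is not completely conserved.

```python
import re

# Q, any single residue, then W. The lookahead lets overlapping hits be reported.
QXW_PATTERN = re.compile(r"(?=(Q.W))")


def find_qxw(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, tripeptide) for every Q-x-W occurrence."""
    return [(m.start(), m.group(1)) for m in QXW_PATTERN.finditer(sequence.upper())]


# Made-up sequence used only to exercise the function.
print(find_qxw("AAQAWGGQLWTTQNWPP"))
```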

    The 3D structure of the ricin B chain has shown that the three QxW repeats pack around a pseudo threefold axis that is stabilised by the lambda linker. The ricin B lectin domain has no major segments of alpha helix or beta sheet, but each of the QxW repeats contains an omega loop. An idealized omega-loop is a compact, contiguous segment of polypeptide that traces a 'loop-shaped' path in three-dimensional space; the main chain resembles a Greek omega.

    Proteins where this domain is known:
    PF14_0532    PF14_0723    PFI0185w   


    PS50234 - VWFA (Prosite link)

    Interpro entry IPR002035 : (Interpro link)

    Interpro description:
    The von Willebrand factor is a large multimeric glycoprotein found in blood plasma. Mutant forms are involved in the aetiology of bleeding disorders. In von Willebrand factor, the type A domain (vWF) is the prototype for a protein superfamily. The vWF domain is found in various plasma proteins: complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen types VI, VII, XII and XIV; and other extracellular proteins. Although the majority of VWA-containing proteins are extracellular, the most ancient ones present in all eukaryotes are all intracellular proteins involved in functions such as transcription, DNA repair, ribosomal and membrane transport and the proteasome. A common feature appears to be involvement in multiprotein complexes. Proteins that incorporate vWF domains participate in numerous biological events (e.g. cell adhesion, migration, homing, pattern formation, and signal transduction), involving interaction with a large array of ligands. A number of human diseases arise from mutations in VWA domains. Secondary structure prediction from 75 aligned vWF sequences has revealed a largely alternating sequence of alpha-helices and beta-strands. Fold recognition algorithms were used to score sequence compatibility with a library of known structures: the vWF domain fold was predicted to be a doubly-wound, open, twisted beta-sheet flanked by alpha-helices. 3D structures have been determined for the I-domains of integrins CD11b (with bound magnesium) and CD11a (with bound manganese). The domain adopts a classic alpha/beta Rossmann fold and contains an unusual metal ion coordination site at its surface. It has been suggested that this site represents a general metal ion-dependent adhesion site (MIDAS) for binding protein ligands. The residues constituting the MIDAS motif in the CD11b and CD11a I-domains are completely conserved, but the manner in which the metal ion is coordinated differs slightly.

    Proteins where this domain is known:
    MAL13P1.76    PF08_0136b    PF13_0201    PF14_0326    PFC0640w    PFF0800w   


    PS50235 - UCH_2_3 (Prosite link)

    Interpro entry IPR001394 : Peptidase C19, ubiquitin carboxyl-terminal hydrolase 2 (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C19 (ubiquitin-specific protease family, clan CA). Families within the CA clan are loosely termed papain-like as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. Predicted active site residues for members of this family and family C1 occur in the same order in the sequence: N/Q, C, H. The type example is human ubiquitin-specific protease 14.

    Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.

    The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 100-200 kDa peptides, which include the Ubp1 ubiquitin peptidase from yeast. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.

    Proteins where this domain is known:
    MAL7P1.147    PF13_0096    PF14_0145    PFA0220w    PFD0165w    PFD0655c    PFE0835w    PFE1355c    PFI0225w   


    PS50237 - HECT (Prosite link)

    Interpro entry IPR000569 : HECT (Interpro link)

    Interpro description:

    The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thiolester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the P53 tumour-suppressor protein for ubiquitin-mediated proteolysis.

    Proteins where this domain is known:
    MAL7P1.19    MAL8P1.23    PF11_0201    PFF1365c   


    PS50238 - RHOGAP (Prosite link)

    Interpro entry IPR000198 : RhoGAP (Interpro link)

    Interpro description:
    Members of the Rho family of small G proteins transduce signals from plasma-membrane receptors and control cell adhesion, motility and shape by actin cytoskeleton formation. Like all other GTPases, Rho proteins act as molecular switches, with an active GTP-bound form and an inactive GDP-bound form. The active conformation is promoted by guanine-nucleotide exchange factors, and the inactive state by GTPase-activating proteins (GAPs) which stimulate the intrinsic GTPase activity of small G proteins. This entry is a Rho/Rac/Cdc42-like GAP domain that is found in a wide variety of large, multi-functional proteins. A number of structures are known for this family. The domain is composed of seven alpha helices. This domain is also known as the breakpoint cluster region-homology (BH) domain.

    Proteins where this domain is known:
    PF10_0071   


    PS50244 - S5A_REDUCTASE (Prosite link)

    Interpro entry IPR001104 : 3-oxo-5-alpha-steroid 4-dehydrogenase, C-terminal (Interpro link)

    Interpro description:

    Synonym(s): Steroid 5-alpha-reductase

    3-oxo-5-alpha-steroid 4-dehydrogenases catalyse the conversion of 3-oxo-5-alpha-steroid + acceptor to 3-oxo-delta(4)-steroid + reduced acceptor. The steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during foetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related steroid reductase enzyme, DET2, is found in plants such as Arabidopsis. Mutations in this enzyme cause defects in light-regulated development. This domain is present in both type 1 and type 2 forms.

    Proteins where this domain is known:
    PF14_0791   


    PS50245 - CAP_GLY_2 (Prosite link)

    Interpro entry IPR000938 : (Interpro link)

    Interpro description:

    Cytoskeleton-associated proteins (CAP) are made of three distinct parts: an N-terminal section that is most probably globular and contains the CAP-Gly domain, a large central region predicted to be in an alpha-helical coiled-coil conformation and, finally, a short C-terminal globular domain. The CAP-Gly domain is a conserved, glycine-rich domain of about 42 residues found in some CAPs. Proteins known to contain this domain include restin (also known as cytoplasmic linker protein-170 or CLIP-170), a 160 kDa protein associated with intermediate filaments and that links endocytic vesicles to microtubules; vertebrate dynactin (150 kDa dynein-associated polypeptide; DAP) and Drosophila glued, a major component of activator I; yeast protein BIK1, which seems to be required for the formation or stabilisation of microtubules during mitosis and for spindle pole body fusion during conjugation; yeast protein NIP100 (NIP80); human protein CKAP1/TFCB; Schizosaccharomyces pombe protein alp11 and Caenorhabditis elegans hypothetical protein F53F4.3. The latter proteins contain an N-terminal ubiquitin domain and a C-terminal CAP-Gly domain.

    The crystal structure of the CAP-Gly domain of C. elegans F53F4.3 protein, solved by single wavelength sulphur-anomalous phasing, revealed a novel protein fold containing three beta-sheets. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove. Residues in the groove are highly conserved as measured from the information content of the aligned sequences. The C-terminal tail of another molecule in the crystal is bound in this groove.
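
    As a rough illustration of the conserved GKNDG sequence mentioned above, the following Python sketch reports where that exact pentapeptide occurs in a protein sequence. The function name find_motif and the toy sequence are hypothetical and are not taken from any P. falciparum protein; exact string matching is an assumption made for simplicity, and signature-based detection is generally more permissive than this.

```python
def find_motif(sequence: str, motif: str = "GKNDG") -> list[int]:
    """Return 1-based start positions of every exact occurrence of motif."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based residue number
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical toy sequence for illustration only (not a real protein).
toy_sequence = "MSAAGKNDGTVRYFGKNDGAA"
print(find_motif(toy_sequence))  # [5, 15]
```

    A hit from such a scan only suggests a candidate site; it says nothing about whether the surrounding residues form the groove described above.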

    Proteins where this domain is known:
    PFI0335w   


    PS50255 - CYTOCHROME_B5_2 (Prosite link)

    Interpro entry IPR001199 : Cytochrome b5 (Interpro link)

    Interpro description:
    Cytochromes b5 are ubiquitous electron transport proteins found in animals, plants and yeasts. The microsomal and mitochondrial variants are membrane-bound, while those from erythrocytes and other animal tissues are water-soluble.

    The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.

    Proteins where this domain is known:
    PF14_0266    PFI0885w    PFL1555w   


    PS50267 - NA_NEUROTRAN_SYMP_3 (Prosite link)

    Interpro entry IPR000175 : Sodium:neurotransmitter symporter (Interpro link)

    Interpro description:

    Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal of released transmitters from the extracellular space, thereby terminating their actions. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl-. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters.

    Sequence analysis of the Na+/Cl- neurotransmitter superfamily reveals that it can be divided into four subfamilies, these being transporters for monoamines, the amino acids proline and glycine, GABA, and a group of orphan transporters.

    Proteins where this domain is known:
    PF11_0334    PFB0435c    PFE0775c   


    PS50271 - ZF_UBP (Prosite link)

    Interpro entry IPR001607 : Zinc finger, UBP-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties.

    Some of the proteins containing a UBP zinc finger include:

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL7P1.120    PF13_0096    PFD0655c   


    PS50275 - SAC (Prosite link)

    Interpro entry IPR002013 : (Interpro link)

    Interpro description:
    Synaptic vesicles are recycled with remarkable speed and precision in nerve terminals. A major recycling pathway involves clathrin-mediated endocytosis at endocytic zones located around sites of release. Different 'accessory' proteins linked to this pathway have been shown to alter the shape and composition of lipid membranes, to modify membrane-coat protein interactions, and to influence actin polymerization. These include the GTPase dynamin, the lysophosphatidic acid acyl transferase endophilin, and the phosphoinositide phosphatase synaptojanin.

    The recessive suppressor of secretory defect in yeast Golgi and yeast actin function belongs to this family. This protein may be involved in the coordination of the activities of the secretory pathway and the actin cytoskeleton.

    Human synaptojanin, which may be localised on coated endocytic intermediates in nerve terminals, also belongs to this family.

    Proteins where this domain is known:
    MAL8P1.151    PF07_0024    PF13_0285   


    PS50280 - SET (Prosite link)

    Interpro entry IPR001214 : (Interpro link)

    Interpro description:

    The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions were recently described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.

    Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.

    The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils.

    The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.

    Proteins where this domain is known:
    MAL13P1.122    PF08_0012    PF13_0293    PFD0190w    PFF1440w    PFI0485c    PFL0690c   


    PS50287 - SRCR_2 (Prosite link)

    Interpro entry IPR001190 : Speract/scavenger receptor (Interpro link)

    Interpro description:

    The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (which may act as a receptor in regulating T-cell proliferation).

    Proteins where this domain is known:
    PF14_0067   


    PS50290 - PI3_4_KINASE_3 (Prosite link)

    Interpro entry IPR000403 : Phosphatidylinositol 3- and 4-kinase, catalytic (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatment of a number of diseases.

    Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase (PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3)) function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.

    The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop which interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.

    Proteins where this domain is known:
    PFD0965W    PFE0485w    PFE0765w   


    PS50293 - TPR_REGION (Prosite link)

    Interpro entry IPR013026 : (Interpro link)

    Interpro description:

    The tetratrico peptide repeat region (TPR) is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.

    The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24° within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed mainly by residues on helices A, and the other surface presents residues from both helices A and B.

    Proteins where this domain is known:
    MAL13P1.139    MAL13P1.18    MAL13P1.274    MAL13P1.52    MAL8P1.60    PF07_0026    PF11_0124    PF13_0190    PF13_0231    PF14_0031    PF14_0098    PF14_0196    PF14_0324    PFC0515c    PFD0180c    PFE0085c    PFE1370w    PFF0080c    PFF1505w    PFI1060w    PFL0615w    PFL2015w    PFL2120w    PFL2275c   


    PS50294 - WD_REPEATS_REGION (Prosite link)

    Interpro entry IPR017986 : (Interpro link)

    Interpro description:

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    Proteins where this domain is known:
    MAL13P1.148    MAL13P1.245    MAL13P1.264    MAL13P1.385    MAL13P1.54    MAL7P1.81    MAL8P1.145    MAL8P1.43    PF07_0017    PF07_0092    PF08_0019    PF08_0065    PF08_0130    PF08_0135    PF10_0128    PF10_0196    PF10_0261    PF10_0326    PF11_0056    PF11_0171    PF11_0222    PF11_0400    PF11_0471    PF13_0149    PF13_0250    PF13_0335    PF14_0087    PF14_0101    PF14_0243    PF14_0263    PF14_0314    PF14_0412    PF14_0456    PFA0520c    PFB0640c    PFC0100c    PFC0365w    PFC0965w    PFD0455w    PFE0090w    PFE0505w    PFE0540w    PFE0930w    PFF0330w    PFF0395c    PFF1000w    PFF1480w    PFI0290c    PFI1080w    PFL0470w    PFL0610w    PFL0970w    PFL1040w    PFL1290w    PFL1395c    PFL1470c    PFL1480w    PFL1820w    PFL1975c    PFL2460w   


    PS50296 - SUI1 (Prosite link)

    Interpro entry IPR001950 : Translation initiation factor SUI1 (Interpro link)

    Interpro description:
    In Saccharomyces cerevisiae (Baker's yeast), SUI1 is a translation initiation factor that functions in concert with eIF-2 and the initiator tRNA-Met in directing the ribosome to the proper start site of translation. SUI1 is a protein of 108 residues. Close homologs of SUI1 have been found in mammals, insects and plants. SUI1 is also evolutionarily related to hypothetical proteins from Escherichia coli (yciH), Haemophilus influenzae (HI1225) and Methanococcus vannielii.

    Proteins where this domain is known:
    PF08_0079    PFI0365w    PFL2095w   


    PS50297 - ANK_REP_REGION (Prosite link)

    Interpro entry IPR002110 : (Interpro link)

    Interpro description:

    The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.

    The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.

    Proteins where this domain is known:
    MAL13P1.126    MAL13P1.71    MAL8P1.28    PF10_0102    PF10_0213    PF10_0328    PF11_0197    PF11_0439    PF14_0106    PF14_0222    PF14_0690    PFC0160w    PFE0400w    PFF1315w    PFF1365c    PFL2200w   


    PS50302 - PUM (Prosite link)

    Interpro entry IPR001313 : Pumilio RNA-binding region (Interpro link)

    Interpro description:

    The Drosophila pumilio gene codes for an unusual protein that binds through the Puf domain, which usually occurs as a tandem repeat of eight domains. The FBF-2 protein of Caenorhabditis elegans also has a Puf domain. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.

    In Saccharomyces cerevisiae (Baker's yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.
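
    The UGUR tetranucleotide motif mentioned above can be located in a 3' UTR with a short scan. The Python sketch below is a minimal illustration, assuming that R is read as the IUPAC purine code (A or G) and that the input may be given as RNA or DNA; the function name find_ugur_sites and the toy UTR are hypothetical and not drawn from any real transcript.

```python
import re

# IUPAC code R stands for a purine (A or G), so UGUR expands to UGUA or UGUG.
# The lookahead lets overlapping sites (e.g. UGUGUA) be reported separately.
UGUR = re.compile(r"(?=(UGU[AG]))")


def find_ugur_sites(utr: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched tetranucleotide) for each UGUR site."""
    rna = utr.upper().replace("T", "U")  # accept DNA input by converting T to U
    return [(m.start() + 1, m.group(1)) for m in UGUR.finditer(rna)]


# Hypothetical toy UTR for illustration only.
toy_utr = "AAUGUAUAUCCUGUGUAUU"
print(find_ugur_sites(toy_utr))  # [(3, 'UGUA'), (12, 'UGUG'), (14, 'UGUA')]
```

    Finding the tetranucleotide does not by itself indicate Puf binding; it only marks candidate positions for closer inspection.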

    Proteins where this domain is known:
    PFD0825c    PFE0935c   


    PS50303 - PUM_HD (Prosite link)

    Interpro entry IPR001313 : Pumilio RNA-binding region (Interpro link)

    Interpro description:

    The Drosophila pumilio gene codes for an unusual protein that binds through the Puf domain, which usually occurs as a tandem repeat of eight domains. The FBF-2 protein of Caenorhabditis elegans also has a Puf domain. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.

    In Saccharomyces cerevisiae (Baker's yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.

    Proteins where this domain is known:
    PF10_0351    PFD0825c    PFE0935c    PFF1030w   
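
    The PS50302 (PUM) and PS50303 (PUM_HD) signatures map to the same Interpro entry but list different protein sets. The Python sketch below shows one way to compare two such whitespace-separated identifier lines as they appear in this listing; the helper name parse_ids is hypothetical, and the two strings are copied from the entries above.

```python
def parse_ids(line: str) -> set[str]:
    """Split a whitespace-separated line of protein identifiers into a set."""
    return set(line.split())


# Identifier lines copied from the PS50302 and PS50303 entries in this listing.
pum = parse_ids("PFD0825c    PFE0935c   ")
pum_hd = parse_ids("PF10_0351    PFD0825c    PFE0935c    PFF1030w   ")

shared = sorted(pum & pum_hd)        # proteins hit by both signatures
only_pum_hd = sorted(pum_hd - pum)   # proteins hit by PUM_HD only

print(shared)       # ['PFD0825c', 'PFE0935c']
print(only_pum_hd)  # ['PF10_0351', 'PFF1030w']
```

    The same comparison applies to any pair of entries in the listing, for example PS50404 (GST_NTER) and PS50405 (GST_CTER).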


    PS50304 - TUDOR (Prosite link)

    Interpro entry IPR018351 : (Interpro link)

    Interpro description:

    The Drosophila tudor protein is encoded by a 'posterior group' gene, which when mutated disrupts normal abdominal segmentation and pole cell formation. Another Drosophila gene, homeless, is required for RNA localization during oogenesis. The tudor protein contains multiple repeats of a domain which is also found in homeless.

    The tudor domain is found in many proteins that colocalise with ribonucleoprotein or single-strand DNA-associated complexes in the nucleus, in the mitochondrial membrane, or at kinetochores. It is not known whether the domain binds directly to RNA and ssDNA, or controls interactions with the nucleoprotein complexes. At least one tudor-containing protein, homeless, also contains a zinc finger typical of RNA-binding proteins.

    The solution structure of the Tudor domain of human SMN revealed that the domain forms a strongly bent antiparallel beta-sheet with five strands forming a barrel-like fold. The structure exhibits a conserved negatively charged surface that interacts with the C-terminal Arg and Gly-rich tails of the spliceosomal Sm D1 and D3 proteins.

    Proteins where this domain is known:
    PF11_0374    PFC1050w   


    PS50305 - SIRTUIN (Prosite link)

    Interpro entry IPR003000 : NAD-dependent histone deacetylase, silent information regulator Sir2 (Interpro link)

    Interpro description:
    These sequences represent the Sir2 family of NAD+-dependent deacetylases. Silent Information Regulator protein of Saccharomyces cerevisiae (Sir2p) is one of several factors critical for silencing at least three loci. Among them, it is unique because it silences the rDNA as well as the mating type loci and telomeres. Sir2p interacts in a complex with itself and with Sir3p and Sir4p, two proteins that are able to interact with nucleosomes. In addition, Sir2p also interacts with ubiquitination factors and/or complexes. Unlike Sir3p and Sir4p, for which no homologues are known, Sir2p is part of a multigene family in yeast, the homologues being HST1, HST2, HST3 and HST4. Highly conserved structural homologues also occur in other organisms ranging from bacteria to man and plants. Proteins of this family have been proposed to play a role in silencing, chromosome stability and ageing. In addition, an in vitro ADP ribosyltransferase activity has been associated with Escherichia coli and human members of this family. Homologues of Sir2 share a core domain including the GAG and NID motifs and a putative C4 Zinc finger. The regions containing these three conserved motifs are individually essential for Sir2 silencing function, as are the four cysteines. In addition, the conserved residues HG next to the putative Zn finger have been shown to be essential for the ADP ribosyltransferase activity. Sir2-like enzymes catalyze a reaction in which the cleavage of NAD(+) and histone and/or protein deacetylation are coupled to the formation of O-acetyl-ADP-ribose, a novel metabolite. The dependence of the reaction on both NAD(+) and the generation of this potential second messenger offers new clues to understanding the function and regulation of nuclear, cytoplasmic and mitochondrial Sir2-like enzymes.

    Proteins where this domain is known:
    PF13_0152    PF14_0489   


    PS50309 - DC (Prosite link)

    Interpro entry IPR003533 : Doublecortin (Interpro link)

    Interpro description:

    X-linked lissencephaly is a severe brain malformation affecting males. Recently it has been demonstrated that the doublecortin gene is implicated in this disorder. Doublecortin was found to bind to the microtubule cytoskeleton. In vivo and in vitro assays show that Doublecortin stabilizes microtubules and causes bundling. Doublecortin is a basic protein with an iso-electric point of 10, typical of microtubule-binding proteins. However, its sequence contains no known microtubule-binding domain(s).

    The detailed sequence analysis of Doublecortin and Doublecortin-like proteins allowed the identification of an evolutionarily conserved Doublecortin (DC) domain. This domain is found in the N-terminus of proteins and consists of one or two tandemly repeated copies of a region of around 80 amino acids. It has been suggested that the first DC domain of Doublecortin binds tubulin and enhances microtubule polymerization.

    Proteins where this domain is known:
    PFE0890c   


    PS50404 - GST_NTER (Prosite link)

    Interpro entry IPR004045 : (Interpro link)

    Interpro description:

    In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of Cephalopoda is also a GST.

    Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.

    Soluble GSTs activate glutathione (GSH) to GS-. In many GSTs, this is accomplished by a Tyr at H-bonding distance from the sulphur of GSH. These enzymes catalyse nucleophilic attack by reduced glutathione (GSH) on nonpolar compounds that contain an electrophilic carbon, nitrogen, or sulphur atom.

    Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold, with each monomer composed of two distinct domains. The N-terminal domain forms a thioredoxin-like fold that binds the glutathione moiety, while the C-terminal domain contains several hydrophobic alpha-helices that specifically bind hydrophobic substrates.

    This entry represents the N-terminal domain of GST.

    Proteins where this domain is known:
    PF13_0214    PF14_0187   


    PS50405 - GST_CTER (Prosite link)

    Interpro entry IPR017933 : (Interpro link)

    Interpro description:

    In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. The GST domain is also found in S-crystallins from squid, and proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of Cephalopoda is also a GST.

    Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.

    Soluble GSTs activate glutathione (GSH) to GS-. In many GSTs, this is accomplished by a Tyr at H-bonding distance from the sulphur of GSH. These enzymes catalyse nucleophilic attack by reduced glutathione (GSH) on nonpolar compounds that contain an electrophilic carbon, nitrogen, or sulphur atom.

    Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold, with each monomer composed of two distinct domains. The N-terminal domain forms a thioredoxin-like fold that binds the glutathione moiety, while the C-terminal domain contains several hydrophobic alpha-helices that specifically bind hydrophobic substrates.

    This entry represents the C-terminal domain of glutathione S-transferases, and a number of redox-regulated chloride ion channel proteins.

    Proteins where this domain is known:
    PF13_0214    PF14_0187   


    PS50600 - ULP_PROTEASE (Prosite link)

    Interpro entry IPR003653 : Peptidase C48, SUMO/Sentrin/Ubl1 (Interpro link)

    Interpro description:

    In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of proteins contains cysteine peptidases belonging to MEROPS peptidase family C48 (Ulp1 endopeptidase family, clan CE). The protein fold of the peptidase domain for members of this family resembles that of adenain, the type example for clan CE. This group of sequences also contains a number of hypothetical proteins, which have not yet been characterised, and non-peptidase homologues. These are proteins that have either been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for the catalytic activity of the peptidases in the family.

    The Ulp1 endopeptidase family contains deubiquitinating enzymes (DUBs) that can de-conjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. They can be classified into three families according to sequence homology: ubiquitin carboxyl-terminal hydrolase (UCH), ubiquitin-specific processing protease (UBP), and ubiquitin-like protease (ULP), which is specific for de-conjugating ubiquitin-like proteins. In contrast to the UBP pathway, which is very redundant (16 UBP enzymes in yeast), there are few ubiquitin-like proteases (only one in yeast, Ulp1).

    Ulp1 catalyses two critical functions in the SUMO/Smt3 pathway via its cysteine protease activity. Ulp1 processes the Smt3 C-terminal sequence (-GGATY) to its mature form (-GG), and it de-conjugates Smt3 from the lysine epsilon-amino group of the target protein.
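    As a rough illustration of the processing step described above, the Python sketch below trims a precursor tail after its last Gly-Gly pair. The function name `mature_c_terminus` and the placeholder tail `AAAGGATY` are assumptions made for this example (only the -GGATY ending comes from the text). It models the change in sequence only, not the enzymology.

```python
def mature_c_terminus(precursor):
    """Return the sequence truncated just after its last Gly-Gly pair.

    This mimics the outcome of C-terminal processing (-GGATY -> -GG);
    it is a string-level sketch, not a model of the protease itself.
    """
    idx = precursor.rfind("GG")
    if idx < 0:
        raise ValueError("no Gly-Gly pair found in sequence")
    return precursor[: idx + 2]


# Placeholder tail ending in -GGATY (hypothetical, not a real protein).
print(mature_c_terminus("AAAGGATY"))
```

    The same helper leaves an already mature tail (one that ends in -GG) unchanged.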

    The crystal structure of yeast Ulp1 bound to Smt3 revealed that the catalytic and interaction interface is situated in a shallow and narrow cleft where conserved residues recognise the Gly-Gly motif at the C-terminal extremity of the Smt3 protein. Ulp1 adopts a novel architecture despite some structural similarity with other cysteine proteases. The secondary structure is composed of seven alpha helices and seven beta strands. The catalytic domain includes the central alpha helix, beta-strands 4 to 6, and the catalytic triad (Cys-His-Asp). This profile is directed against the C-terminal part of ULP proteins that displays full proteolytic activity.

    Proteins where this domain is known:
    MAL8P1.157    PFL1635w   


    PS50800 - SAP (Prosite link)

    Interpro entry IPR003034 : DNA-binding SAP (Interpro link)

    Interpro description:

    The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA binding domain found in diverse nuclear proteins involved in chromosomal organization, including in apoptosis. In yeast, SAP is found in the most distal N-terminal region of E3 SUMO-protein ligase SIZ1, where it is involved in nuclear localization.

    Proteins where this domain is known:
    MAL13P1.302    PFI0610w   


    PS50802 - OTU (Prosite link)

    Interpro entry IPR003323 : (Interpro link)

    Interpro description:

    This is a group of proteins found primarily in viruses, eukaryotes and in the pathogenic bacterium Chlamydia pneumoniae. In viruses they are annotated as replicase or RNA-dependent RNA polymerase. The eukaryotic sequences are related to the Ovarian Tumour (OTU) gene in Drosophila, cezanne deubiquitinating peptidase and tumor necrosis factor, alpha-induced protein 3 (MEROPS peptidase family C64) and otubain 1 and otubain 2 (MEROPS peptidase family C65).

    None of these proteins has a known biochemical function, but low sequence similarity with the polyprotein regions of arteriviruses, together with conserved cysteine and histidine (and possibly aspartate) residues, suggests that those not yet recognised as peptidases could possess cysteine protease activity.

    Proteins where this domain is known:
    PF10_0308    PF11_0428    PFI1135c   


    PS50815 - HORMA (Prosite link)

    Interpro entry IPR003511 : DNA-binding HORMA (Interpro link)

    Interpro description:

    The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double-stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. Hop1 is a meiosis-specific protein, Rev7 is required for DNA damage induced mutagenesis, and MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity.

    Proteins where this domain is known:
    PF10_0227    PF13_0050   


    PS50817 - INTEIN_N_TER (Prosite link)

    Interpro entry IPR006141 : Intein splicing site (Interpro link)

    Interpro description:

    Inteins, or protein introns, are parts of protein sequences that are post-translationally excised, their flanking regions (exteins) being spliced together to yield an additional protein product. This process is believed to be self-catalysed, apparently initiating at the C-terminal splice junction, where a conserved asparagine residue mediates the nucleophilic attack of the peptide bond between it and its neighbouring residue. Most inteins consist of two domains: One is involved in autocatalytic splicing, and the other is an endonuclease that is important in the spread of inteins.

    Inteins are between 134 and 608 amino acids long, and they are found in members of all three domains of life: eukaryotes, bacteria, and archaea, although most frequently in archaea. Inteins are found in proteins with diverse functions, including metabolic enzymes, DNA and RNA polymerases, proteases, ribonucleotide reductases, and the vacuolar-type ATPase. However, enzymes involved in DNA replication and repair appear to dominate. Inteins are found in conserved regions of conserved proteins and can be regarded as parasitic genetic elements.

    In most cases the intein seems to be an endonuclease which belongs to MEROPS peptidase family C46. It has been proposed that the splicing initiates at the C-terminal splice junction. The delta-nitrogen group of a conserved asparagine residue makes a nucleophilic attack on the peptide bond that links this asparagine to the next residue. The next residue (a Cys, Ser or Thr) is then free to attack the peptide bond at the N-terminal splice junction by a transpeptidation reaction that releases the intein and creates a new peptide bond. Such a mechanism is briefly schematised in the following figures.

    Inteins are difficult to identify from sequence data because they lie in the same reading frame as the spliced protein and they are characterised by only a few short conserved motifs: two of these are similar to the nonapeptide LAGLIDADG, which is diagnostic of certain homing endonucleases (mutation of one such motif causes loss of endonuclease activity, but not of the protein splicing function); another includes the C' splice site, mutations in which disable protein function.

    Proteins where this domain is known:
    PF11_0074   


    PS50820 - LCCL (Prosite link)

    Interpro entry IPR004043 : (Interpro link)

    Interpro description:

    The LCCL domain has been named after the best characterised proteins that were found to contain it, namely Limulus factor C, Coch-5b2 and Lgl1. It is a domain of about 100 amino acids whose C-terminal part contains a highly conserved histidine in a conserved motif YxxxSxxCxAAVHxGVI. The LCCL module is thought to be an autonomously folding domain that has been used for the construction of various modular proteins through exon-shuffling. It has been found in various metazoan proteins in association with complement B-type domains, C-type lectin domains, von Willebrand type A domains, CUB domains, discoidin lectin domains or CAP domains. It has been proposed that the LCCL domain could be involved in lipopolysaccharide (LPS) binding. Secondary structure prediction suggests that the LCCL domain contains six beta strands and two alpha helices.

    Some proteins known to contain an LCCL domain include Limulus factor C, an LPS endotoxin-sensitive trypsin-type serine protease which serves to protect the organism from bacterial infection; vertebrate cochlear protein cochlin or coch-5b2 (cochlin is probably a secreted protein; mutations affecting the LCCL domain of coch-5b2 cause the deafness disorder DFNA9 in humans); and mammalian late gestation lung protein Lgl1, which contains two tandem copies of the LCCL domain.
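    The conserved motif quoted above (YxxxSxxCxAAVHxGVI) can be located in a sequence with a regular expression. The following Python sketch reports where the conserved histidine falls. The function name and the toy sequence are assumptions for illustration, and "x" is read as "any single residue".

```python
import re

# Motif from the text with 'x' = any residue; the conserved histidine
# is captured as group 1 so its position can be reported.
LCCL_PATTERN = re.compile(r"Y.{3}S.{2}C.AAV(H).GVI")


def conserved_histidine_positions(sequence):
    """Return the 1-based position of the motif histidine for each match."""
    return [m.start(1) + 1 for m in LCCL_PATTERN.finditer(sequence)]


# Synthetic sequence built to contain the motif (not a real protein).
toy_sequence = "MKYQQQSQQCQAAVHQGVILL"
print(conserved_histidine_positions(toy_sequence))
```

    A strict pattern such as this will miss diverged family members, which is why profile-based methods are normally preferred for domain detection.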

    Proteins where this domain is known:
    PF14_0067    PF14_0532    PF14_0723    PFA0445w    PFI0185w   


    PS50823 - KH_TYPE_2 (Prosite link)

    Interpro entry IPR004044 : K Homology, type 2 (Interpro link)

    Interpro description:

    The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. It is a domain of around 70 amino acids that is present in a wide variety of quite diverse nucleic acid-binding proteins. It has been shown to bind RNA. Like many other RNA-binding motifs, KH motifs are found in one or multiple copies (14 copies in chicken vigilin) and, at least for hnRNP K (three copies) and FMR-1 (two copies), each motif is necessary for in vitro RNA binding activity, suggesting that they may function cooperatively or, in the case of single KH motif proteins (for example, Mer1p), independently.

    According to structural analysis, the KH domain can be separated into two groups. The first group, or type-1, contains a beta-alpha-alpha-beta-beta-alpha structure, whereas in type-2 the last two beta-sheets are located in the N-terminal part of the domain (alpha-beta-beta-alpha-alpha-beta). Sequence similarity between these two folds is limited to a short region (VIGXXGXXI) in the RNA-binding motif. This motif is located between helices 1 and 2 in type-1 and between helices 2 and 3 in type-2. Proteins known to contain a type-2 KH domain include the eukaryotic and prokaryotic S3 family of ribosomal proteins, and the prokaryotic GTP-binding protein Era.
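    The helix numbering given for the two folds can be checked with a short Python sketch. It encodes each topology as a string ("a" for an alpha helix, "b" for a beta element) and finds the element order that both share. Treating "baab" as the shared core is an assumption read off the two topologies quoted in the text, and the function name is hypothetical.

```python
# Order of secondary-structure elements as given in the text
# (a = alpha helix, b = beta element).
TOPOLOGIES = {
    "type-1": "baabba",
    "type-2": "abbaab",
}

# Assumption: the element order common to both folds.
CORE = "baab"


def helices_in_core(topology, core=CORE):
    """Return the 1-based numbers of the two helices inside the shared core."""
    start = topology.find(core)
    if start < 0:
        raise ValueError("core arrangement not found in topology")
    helices_before_core = topology[:start].count("a")
    return helices_before_core + 1, helices_before_core + 2


for name, topology in TOPOLOGIES.items():
    print(name, helices_in_core(topology))
```

    The core helices come out as 1 and 2 for type-1 and as 2 and 3 for type-2, matching the motif locations stated in the text.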

    Proteins where this domain is known:
    PF14_0627   


    PS50829 - GYF (Prosite link)

    Interpro entry IPR003169 : (Interpro link)

    Interpro description:

    The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important.
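    To make the motif notation concrete, the Python sketch below turns GP[YF]xxxx[MV]xxWxxx[GN]YF into a regular expression and scans a sequence for it. The helper names and the toy sequence are assumptions for illustration. "x" is read as any residue and square brackets as a choice of residues at one position.

```python
import re

# Motif as written in the text: 'x' stands for any residue,
# square brackets list the residues allowed at one position.
GYF_MOTIF = "GP[YF]xxxx[MV]xxWxxx[GN]YF"


def motif_to_regex(motif):
    """Translate the motif notation into a compiled regular expression."""
    return re.compile(motif.replace("x", "."))


def find_motif(sequence, motif=GYF_MOTIF):
    """Return the 0-based start index of every motif match in `sequence`."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence)]


# Synthetic sequence built to contain the motif (not a real protein).
toy_sequence = "AAGPYAAAAMAAWAAAGYFAA"
print(find_motif(toy_sequence))
```

    Matches are reported without overlap, which is sufficient for a motif of this length.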

    Proteins where this domain is known:
    PF10_0183    PF13_0067    PFF0220w   


    PS50830 - TNASE_3 (Prosite link)

    Interpro entry IPR006021 : Staphylococcal nuclease (SNase-like) (Interpro link)

    Interpro description:

    Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.

    Proteins where this domain is known:
    PF11_0374   


    PS50832 - S1_IF1_TYPE (Prosite link)

    Interpro entry IPR006196 : S1, IF1 type (Interpro link)

    Interpro description:

    The S1 domain of around 70 amino acids, originally identified in ribosomal protein S1, is found in a large number of RNA-associated proteins. It has been shown that S1 proteins bind RNA through their S1 domains with some degree of sequence specificity. This type of S1 domain is found in translation initiation factor 1.

    The solution structure of one S1 RNA-binding domain from Escherichia coli polynucleotide phosphorylase has been determined. It displays some similarity with the cold shock domain (CSD). Both the S1 and the CSD domain consist of an antiparallel beta barrel of the same topology with 5 beta strands. This fold is also shared by many other proteins of unrelated function and is known as the OB fold. However, the S1 and CSD fold can be distinguished from the other OB folds by the presence of a short 3(10) helix at the end of strand 3. This unique feature is likely to form a part of the DNA/RNA-binding site.

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    PF11_0447    PF14_0658   


    PS50833 - BRIX (Prosite link)

    Interpro entry IPR007109 : (Interpro link)

    Interpro description:

    The Brix domain is found in a number of eukaryotic proteins including some from Saccharomyces cerevisiae and Homo sapiens, Arabidopsis thaliana Peter Pan-like protein and several hypothetical proteins.

    There are six (one archaean and five eukaryotic) protein families which have a similar domain architecture with a central globular Brix domain. They have an optional N-terminal segment and an obligatory C-terminal segment, both of which have charged low-complexity regions.

    Proteins from the Imp4/Brix superfamily appear to be involved in ribosomal RNA processing, which is essential for the functioning of all cells. The N- and C-terminal halves of a member of the superfamily, Mil, show significant structural similarity to one another. This suggests an origin by means of an ancestral duplication. Both halves have the same fold as the anticodon-binding domain of class IIa aminoacyl-tRNA synthetases, with greater conservation seen in the N-terminal half. Structural evidence suggests that the Imp4/Brix superfamily proteins could bind single-stranded segments of RNA along a concave surface formed by the N-terminal half of their beta-sheet and a central alpha-helix.

    Proteins where this domain is known:
    PF07_0122    PF08_0053    PF08_0055    PF10_0278    PFI1070c   


    PS50841 - DIX (Prosite link)

    Interpro entry IPR001158 : DIX (Interpro link)

    Interpro description:

    Dishevelled (Dsh) protein is an important component of the Wnt signal-transduction pathway. It has three relatively conserved domains: DIX, PDZ and DEP. The DIX domain of Dvl-1 (a mammalian Dishevelled homologue) shares 37% identity with the C-terminal region of Axin. Dsh can interact with the Axin/APC/GSK3/beta-catenin complex, and may thus modulate its activity.

    The Wnt signalling pathway is conserved in various species from Caenorhabditis elegans to mammals, and plays important roles in development, cellular proliferation, and differentiation. The molecular mechanisms by which the Wnt signal regulates cellular functions are becoming increasingly well understood. Wnt stabilizes cytoplasmic beta-catenin, which stimulates the expression of genes including c-myc, c-jun, fra-1, and cyclin D1. Axin and its homologue Axil are components of the Wnt signalling pathway that negatively regulate this pathway. Other components of the Wnt signalling pathway, including Dvl, glycogen synthase kinase-3beta (GSK-3beta), beta-catenin, and adenomatous polyposis coli (APC), interact with Axin, and the phosphorylation and stability of beta-catenin are regulated in the Axin complex. Axil has similar functions to Axin. Thus, Axin and Axil act as scaffold proteins in the Wnt signalling pathway, thereby modulating the Wnt-dependent cellular functions.

    Proteins where this domain is known:
    PF13_0221   


    PS50846 - HMA_2 (Prosite link)

    Interpro entry IPR006121 : Heavy metal transport/detoxification protein (Interpro link)

    Interpro description:

    Proteins that transport heavy metals in micro-organisms and mammals share similarities in their sequences and structures.

    These proteins provide an important focus for research, some being involved in bacterial resistance to toxic metals, such as lead and cadmium, while others are involved in inherited human syndromes, such as Wilson's and Menkes diseases.

    A conserved domain has been found in a number of these heavy metal transport or detoxification proteins. The domain, which has been termed Heavy-Metal-Associated (HMA), contains two conserved cysteines that are probably involved in metal binding.

    Structure solution of the fourth HMA domain of the Menkes copper-transporting ATPase shows a well-defined structure comprising a four-stranded antiparallel beta-sheet and two alpha helices packed in an alpha-beta sandwich fold. This fold is common to other domains and is classified as "ferredoxin-like".

    Proteins where this domain is known:
    PFI0240c   


    PS50850 - MFS (Prosite link)

    Proteins where this domain is known:
    PFB0210c    PFB0275w    PFI0785c    PFI0955w   


    PS50857 - COX2_CUA (Prosite link)

    Interpro entry IPR002429 : Cytochrome c oxidase subunit II C-terminal (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).

    Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.

    Proteins where this domain is known:
    PF13_0327    PF14_0288   


    PS50858 - BSD (Prosite link)

    Interpro entry IPR005607 : (Interpro link)

    Interpro description:

    The BSD domain is about 60 residues long and is named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primal protozoan to human. It can be found associated with other domains, such as the BTB domain or the U-box, in multidomain proteins. The function of the BSD domain is as yet unknown.

    Secondary structure prediction indicates the presence of three predicted alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain.

    Some proteins known to contain one or two BSD domains are listed below:
  • Mammalian TFIIH basal transcription factor complex p62 subunit (GTF2H1).
  • Yeast RNA polymerase II transcription factor B 73 kDa subunit (TFB1), the homologue of BTF2.
  • Yeast DOS2 protein. It is involved in single-copy DNA replication and ubiquitination.
  • Drosophila synapse-associated protein SAP47.
  • Mammalian SYAP1.
  • Various Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins.

    Proteins where this domain is known:
    PFC1055w    PFD1095w    PFI0730w   


    PS50859 - LONGIN (Prosite link)

    Interpro entry IPR010908 : Longin (Interpro link)

    Interpro description:

    VAMPs (and their homologues, the synaptobrevins) define a group of SNARE proteins that contain a C-terminal coiled-coil/SNARE domain, in combination with variable N-terminal domains that are used to classify VAMPs: those containing longin N-terminal domains (~150 aa) are referred to as longins, while those with shorter N-termini are referred to as brevins. Longins are the only type of VAMP protein found in all eukaryotes, suggesting that their longin domain is essential. The longin domain is thought to exert a regulatory function. Longin domains have been shown to share the same structural fold, a profilin-like globular domain consisting of a five-stranded antiparallel beta-sheet that is sandwiched by an alpha-helix on one side, and two alpha-helices on the other (beta(2)-alpha-beta(3)-alpha(2)).

    Proteins where this domain is known:
    MAL13P1.135    MAL13P1.16    PFC0890w    PFI0515w   


    PS50860 - AA_TRNA_LIGASE_II_ALA (Prosite link)

    Interpro entry IPR018165 : Alanyl-tRNA synthetase, class IIc, conserved region (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Alanyl-tRNA synthetase is an alpha4 tetramer that belongs to class IIc.

    Proteins where this domain is known:
    PF13_0354   


    PS50862 - AA_TRNA_LIGASE_II (Prosite link)

    Interpro entry IPR006195 : Aminoacyl-tRNA synthetase, class II, conserved region (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    This entry recognises all class II enzymes except for heterodimeric glycyl-tRNA synthetases and alanyl-tRNA synthetases.
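    The class assignments listed in the description can be collected into a small lookup table. The Python sketch below does this and also returns the tRNA hydroxyl that the text gives as the coupling site for each class. The names `CLASS_I`, `CLASS_II` and `synthetase_class` are assumptions for illustration, and the table covers only the general rule stated in the text.

```python
# Amino-acid specificities per synthetase class, as listed in the text.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}

# Hydroxyl of the tRNA to which the aminoacyl group is coupled.
COUPLING_HYDROXYL = {"I": "2'", "II": "3'"}


def synthetase_class(amino_acid):
    """Return (class, coupling hydroxyl) for the synthetase of an amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I", COUPLING_HYDROXYL["I"]
    if name in CLASS_II:
        return "II", COUPLING_HYDROXYL["II"]
    raise KeyError("unknown amino acid: " + amino_acid)


print(synthetase_class("alanine"))
print(synthetase_class("valine"))
```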

    Proteins where this domain is known:
    PF07_0073    PF11_0270    PF13_0262    PF14_0166    PF14_0198    PF14_0428    PFA0145c    PFA0480w    PFB0525w    PFE0475w    PFE0715w    PFF0180w    PFI1240c    PFL0670c    PFL0770w   


    PS50865 - ZF_MYND_2 (Prosite link)

    Interpro entry IPR002893 : Zinc finger, MYND-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents MYND-type zinc finger domains. The MYND domain (myeloid, Nervy, and DEAF-1) is present in a large group of proteins that includes RP-8 (PDCD2), Nervy, and predicted proteins from Drosophila, mammals, Caenorhabditis elegans, yeast, and plants. The MYND domain consists of a cluster of cysteine and histidine residues, arranged with an invariant spacing to form a potential zinc-binding motif. Mutating conserved cysteine residues in the DEAF-1 MYND domain does not abolish DNA binding, which suggests that the MYND domain might be involved in protein-protein interactions. Indeed, the MYND domain of ETO/MTG8 interacts directly with the N-CoR and SMRT co-repressors. Aberrant recruitment of co-repressor complexes and inappropriate transcriptional repression is believed to be a general mechanism of leukemogenesis caused by the t(8;21) translocations that fuse ETO with the acute myelogenous leukemia 1 (AML1) protein. ETO has been shown to be a co-repressor recruited by the promyelocytic leukemia zinc finger (PLZF) protein. A divergent MYND domain present in the adenovirus E1A binding protein BS69 was also shown to interact with N-CoR and mediate transcriptional repression. The current evidence suggests that the MYND motif in mammalian proteins constitutes a protein-protein interaction domain that functions as a co-repressor-recruiting interface.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF13_0293    PFF0105w    PFF0350w   


    PS50866 - GOLD (Prosite link)

    Interpro entry IPR009038 : (Interpro link)

    Interpro description:

    The GOLD (for Golgi dynamics) domain is a protein module found in several eukaryotic Golgi and lipid-traffic proteins. It is typically between 90 and 150 amino acids long. Most of the size difference observed in the GOLD-domain superfamily is traceable to a single large low-complexity insert that is seen in some versions of the domain. With the exception of the p24 proteins, which have a simple architecture with the GOLD domain as their only globular domain, all other GOLD-domain proteins contain additional conserved globular domains. In these proteins, the GOLD domain co-occurs with lipid-, sterol- or fatty acid-binding domains such as PH, CRAL-TRIO, FYVE, oxysterol-binding and acyl-CoA-binding domains, suggesting that these proteins may interact with membranes. The GOLD domain can also be found associated with a RUN domain, which may have a role in the interaction of various proteins with cytoskeletal filaments. The GOLD domain is predicted to mediate diverse protein-protein interactions. A secondary structure prediction for the GOLD domain reveals that it is likely to adopt a compact all-beta-fold structure with six to seven strands. Most of the sequence conservation is centred on the hydrophobic cores that support these predicted strands. The predicted secondary-structure elements and the size of the conserved core of the domain suggest that it may form a beta-sandwich fold with the strands arranged in two beta sheets stacked on each other.

    Some proteins known to contain a GOLD domain are listed below:
  • Eukaryotic proteins of the p24 family.
  • Animal Sec14-like proteins. They are involved in secretion.
  • Human Golgi resident protein GCP60. It interacts with the Golgi integral membrane protein Giantin.
  • Yeast oxysterol-binding protein homolog 3 (OSH3).

    Proteins where this domain is known:
    PFE1340w   


    PS50868 - POST_SET (Prosite link)

    Interpro entry IPR003616 : (Interpro link)

    Interpro description:

    This region is found in a number of histone lysine methyltransferases (HMTase), C-terminal to the SET domain; it is generally described as the post-SET domain.

    Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTases) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.

    The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils.
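
    The cysteine and zinc counts above imply that some cysteines are shared between zinc ions. The following is a minimal bookkeeping sketch, assuming (this is not stated in the description) that every cysteine binds either one zinc ion or bridges exactly two; the function name is hypothetical.

```python
# Counts taken from the description: 9 cysteines, 3 zinc ions,
# each zinc ion coordinated by 4 cysteines.
N_CYS = 9
N_ZINC = 3
LIGANDS_PER_ZINC = 4


def bridging_cysteines(n_cys, n_zinc, ligands_per_zinc):
    """Return how many cysteines must bridge two zinc ions.

    Assumption (not from the source): each cysteine is either terminal
    (bound to one zinc) or bridging (bound to exactly two).
    """
    total_bonds = n_zinc * ligands_per_zinc
    # Every bridging cysteine contributes one bond beyond the first.
    bridging = total_bonds - n_cys
    if not 0 <= bridging <= n_cys:
        raise ValueError("counts cannot be met with terminal/bridging cysteines only")
    return bridging


bridging = bridging_cysteines(N_CYS, N_ZINC, LIGANDS_PER_ZINC)
terminal = N_CYS - bridging
print(f"bridging={bridging}, terminal={terminal}")
```

    Under that assumption, 12 zinc-sulfur bonds spread over 9 cysteines require 3 bridging and 6 terminal cysteines, which is consistent with a triangular cluster.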

    The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine-binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.

    Proteins where this domain is known:
    PF08_0012    PFF1440w   


    PS50878 - RT_POL (Prosite link)

    Interpro entry IPR000477 : RNA-directed DNA polymerase (reverse transcriptase) (Interpro link)

    Interpro description:
    The use of an RNA template to produce DNA, for integration into the host genome and exploitation of a host cell, is a strategy employed in the replication of retroid elements, such as the retroviruses and bacterial retrons. The enzyme catalysing polymerisation is an RNA-directed DNA-polymerase, or reverse transcriptase (RT). Reverse transcriptase occurs in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses.

    Retroviral reverse transcriptase is synthesised as part of the POL polyprotein, which contains an aspartyl protease, a reverse transcriptase, RNase H and integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. The discovery of retroelements in the prokaryotes raises intriguing questions concerning their roles in bacteria, the origin and evolution of reverse transcriptases, and whether the bacterial reverse transcriptases are older than eukaryotic reverse transcriptases.

    Proteins where this domain is known:
    PF13_0080   


    PS50881 - S5_DSRBD (Prosite link)

    Interpro entry IPR013810 : Ribosomal protein S5, N-terminal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S5 is one of the proteins from the small ribosomal subunit, and is a protein of 166 to 254 amino-acid residues. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, cyanelle, red algal chloroplast, archaeal and fungal mitochondrial S5; mammalian, Caenorhabditis elegans, Drosophila and plant S2; and yeast S4 (SUP44).

    This entry represents the N-terminal domain of ribosomal protein S5, which has an alpha-beta(3)-alpha structure that folds into two layers, alpha/beta.

    Proteins where this domain is known:
    PF14_0448   


    PS50882 - YTH (Prosite link)

    Interpro entry IPR007275 : (Interpro link)

    Interpro description:
    This family of poorly characterised proteins contains YT521-B, a putative splicing factor from rat. YT521-B is a tyrosine-phosphorylated nuclear protein that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68 kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner.

    Proteins where this domain is known:
    PF14_0193    PFC0410w   


    PS50886 - TRBD (Prosite link)

    Interpro entry IPR002547 : tRNA-binding region (Interpro link)

    Interpro description:
    This domain is found in prokaryotic methionyl-tRNA synthetases, prokaryotic phenylalanyl-tRNA synthetases, the yeast GU4 nucleic-binding protein (G4p1 or p42, ARC1), human tyrosyl-tRNA synthetase, and endothelial-monocyte activating polypeptide II. G4p1 binds specifically to tRNA and forms a complex with methionyl-tRNA synthetases. In human tyrosyl-tRNA synthetase this domain may direct tRNA to the active site of the enzyme. This domain may perform a common function in tRNA aminoacylation.

    Proteins where this domain is known:
    PF14_0401   


    PS50889 - S4 (Prosite link)

    Interpro entry IPR002942 : RNA-binding S4 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterised, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.

    Proteins where this domain is known:
    MAL8P1.59    PF11_0065    PF11_0181    PF14_0584    PFB0890c    PFE1005w    PFI0685w    PFL1350w    PFL1380w   


    PS50890 - PUA (Prosite link)

    Interpro entry IPR002478 : PUA (Interpro link)

    Interpro description:

    The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as in enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.

    Proteins where this domain is known:
    PF14_0174    PF14_0481    PF14_0635    PFE1470w   


    PS50892 - V_SNARE (Prosite link)

    Interpro entry IPR001388 : Synaptobrevin (Interpro link)

    Interpro description:

    Synaptobrevin is an intrinsic membrane protein of small synaptic vesicles, specialised secretory organelles of neurons that actively accumulate neurotransmitters and participate in their calcium-dependent release by exocytosis. Vesicle function is mediated by proteins in their membranes, although the precise nature of the underlying protein-protein interactions is still uncertain. Synaptobrevin may play a role in the molecular events underlying neurotransmitter release and vesicle recycling and may be involved in the regulation of membrane flow in the nerve terminal, a process mediated by interaction with low molecular weight GTP-binding proteins. Synaptic vesicle-associated membrane proteins (VAMPs) from Torpedo californica (Pacific electric ray) and SNC1 from yeast are related to synaptobrevin.

    Proteins where this domain is known:
    MAL13P1.135    MAL13P1.16    MAL8P1.21    PFC0890w   


    PS50893 - ABC_TRANSPORTER_2 (Prosite link)

    Interpro entry IPR003439 : ABC transporter-like (Interpro link)

    Interpro description:

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.

    The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.
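
    The conserved motifs named above can be located in a sequence with a simple pattern scan. The sketch below is illustrative only: the LSGGQ signature is quoted from the description, whereas the Walker A consensus G-x(4)-G-K-[ST] is the commonly cited P-loop pattern and is an assumption here, and the toy sequence and function name are hypothetical.

```python
import re

# Walker A (P-loop) consensus G-x(4)-G-K-[ST]: assumed, not given in the description.
WALKER_A = re.compile(r"G.{4}GK[ST]")
# ABC signature motif exactly as quoted in the description.
SIGNATURE = re.compile(r"LSGGQ")


def find_motifs(seq):
    """Return 1-based start positions of each motif in a protein sequence."""
    seq = seq.upper()
    return {
        "walker_a": [m.start() + 1 for m in WALKER_A.finditer(seq)],
        "signature": [m.start() + 1 for m in SIGNATURE.finditer(seq)],
    }


# Hypothetical toy sequence, not a real P. falciparum protein.
toy = "MAAGPSGSGKSTLLNAELSGGQKQRVA"
print(find_motifs(toy))
```

    Real ABC modules often carry degenerate versions of these motifs, so an exact-match scan like this one will miss some true cassettes; profile-based methods such as the Prosite entry itself are more sensitive.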

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    On the basis of sequence similarities a family of related ATP-binding proteins has been characterised.

    The proteins belonging to this family also contain one or two copies of the 'A' consensus sequence or the 'P-loop'.

    Proteins where this domain is known:
    MAL13P1.344    PF08_0078    PF11_0225    PF11_0466    PF13_0218    PF13_0271    PF14_0133    PF14_0244    PF14_0455    PFA0590w    PFC0125w    PFC0875w    PFE1150w    PFL0495c    PFL1410c   


    PS50896 - LISH (Prosite link)

    Interpro entry IPR006594 : (Interpro link)

    Interpro description:

    The LisH motif is found in a large number of eukaryotic proteins, from metazoa, fungi and plants, that have a wide range of functions. The recently solved structure of the LisH domain in the N-terminal region of LIS1 showed it to be a novel dimerisation motif and indicated that other structural elements are likely to play an important role in dimerisation.

    A sequence motif, LisH, has been identified in the products of genes mutated in Miller-Dieker lissencephaly, Treacher Collins, oral-facial-digital type 1 and contiguous syndrome ocular albinism with late onset sensorineural deafness syndromes. An additional homologous motif was detected in a gene product fused to the fibroblast growth factor receptor type 1 in patients with an atypical stem cell myeloproliferative disorder. In total, over 100 eukaryotic intracellular proteins are shown to possess a LIS1 homology (LisH) motif, including several katanin p60 subunits, muskelin, tonneau, LEUNIG, Nopp140, aimless and numerous WD repeat-containing beta-propeller proteins.

    It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The predicted secondary structure of LisH motifs, and their occurrence in homologues of Gbeta beta-propeller subunits, suggests that they are analogues of Ggamma subunits, and might associate with the periphery of beta-propeller domains.

    Proteins where this domain is known:
    MAL13P1.182    MAL13P1.54    PF13_0018    PF13_0164    PFE0930w    PFL0920c   


    PS50897 - CTLH (Prosite link)

    Interpro entry IPR006595 : (Interpro link)

    Interpro description:

    The 33-residue LIS1 homology (LisH) motif is found in eukaryotic intracellular proteins involved in microtubule dynamics, cell migration, nucleokinesis and chromosome segregation. The LisH motif is likely to possess a conserved protein-binding function and it has been proposed that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The LisH motif is found associated with other domains, such as WD-40, SPRY, Kelch, AAA ATPase, RasGEF, or HEAT. The secondary structure of the LisH domain is predicted to be two alpha-helices.

    Some proteins known to contain a LisH motif are listed below:
  • Animal LIS1. It regulates cytoplasmic dynein function. In Homo sapiens (human) children with defects in LIS1 suffer from Miller-Dieker lissencephaly, a brain malformation that results in severe retardation, epilepsy and an early death.
  • Emericella nidulans (Aspergillus nidulans) nuclear migration protein nudF, the orthologue of LIS1.
  • Eukaryotic RanBPM, a Ran binding protein involved in microtubule nucleation.
  • Eukaryotic Nopp140, a nucleolar phosphoprotein.
  • Mammalian treacle, a nucleolar protein. In human, defects in treacle are the cause of Treacher Collins syndrome (TCS), an autosomal dominant disorder of craniofacial development.
  • Animal muskelin. It acts as a mediator of cell spreading and cytoskeletal responses to the extracellular matrix component thrombospondin 1.
  • Animal transducin beta-like 1 protein (TBL1).
  • Plant tonneau.
  • Arabidopsis thaliana LEUNIG, a putative transcriptional corepressor that regulates AGAMOUS expression during flower development.
  • Fungal aimless RasGEF.
  • Leishmania major katanin-like protein.

    The C-terminal to LisH (CTLH) motif is a predicted alpha-helical sequence of unknown function that is found adjacent to the LisH motif in a number of these proteins but is absent in others (e.g. LIS1). The CTLH domain can also be found in the absence of the LisH motif, as in:

  • Arabidopsis thaliana (Mouse-ear cress) hypothetical protein MUD21.5.
  • Saccharomyces cerevisiae yeast protein RMD5.

    Proteins where this domain is known:
    MAL13P1.182    PF13_0164   


    PS50902 - FLAVODOXIN_LIKE (Prosite link)

    Interpro entry IPR008254 : Flavodoxin/nitric oxide synthase (Interpro link)

    Interpro description:

    This domain is found in a number of proteins including flavodoxin and nitric-oxide synthase. Flavodoxins are electron-transfer proteins that function in various electron transport systems. They bind one FMN molecule, which serves as a redox-active prosthetic group, and are functionally interchangeable with ferredoxins. They have been isolated from prokaryotes, cyanobacteria, and some eukaryotic algae. Nitric oxide synthase produces nitric oxide from L-arginine and NADPH. Nitric oxide acts as a messenger molecule in the body.

    Proteins where this domain is known:
    PF14_0478    PFI1140w   


    PS50904 - PRELI_MSF1 (Prosite link)

    Interpro entry IPR006797 : (Interpro link)

    Interpro description:

    These proteins contain a conserved region found in the yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. GFP-tagged MSF1 localizes to mitochondria and is required for wild-type respiratory growth. This region is also found in a number of other eukaryotic proteins. The PRELI/MSF1 domain is a eukaryotic protein module which occurs in stand-alone form in several proteins, including the human PRELI protein and the yeast MSF1 protein, and as an amino-terminal domain in an orthologous group of proteins typified by human SEC14L1, which is conserved in all animals. In this group of proteins, the PRELI/MSF1 domain co-occurs with the CRAL-TRIO and the GOLD domains. The PRELI/MSF1 domain is approximately 170 residues long and is predicted to assume a globular alpha + beta fold with six beta strands and four alpha helices. It has been suggested that the PRELI/MSF1 domain may have a function associated with cellular membranes.

    Proteins where this domain is known:
    PF13_0138   


    PS50908 - RWD (Prosite link)

    Interpro entry IPR006575 : (Interpro link)

    Interpro description:

    The RWD domain is a eukaryotic domain found in RING finger- and WD repeat-containing proteins and in a DEXDc-like helicase subfamily; it is related to the ubiquitin-conjugating enzyme domain.

    Proteins where this domain is known:
    MAL8P1.41    PF13_0297    PF14_0264   


    PS50913 - GRIP (Prosite link)

    Interpro entry IPR000237 : (Interpro link)

    Interpro description:
    The GRIP (golgin-97, RanBP2alpha, Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue.

    Proteins where this domain is known:
    PFC0235w   


    PS50919 - MIR (Prosite link)

    Interpro entry IPR016093 : MIR motif (Interpro link)

    Interpro description:

    The MIR domain is named after three of the proteins in which it occurs: protein Mannosyltransferase, Inositol 1,4,5-trisphosphate receptor (IP3R) and Ryanodine receptor (RyR). MIR domains have also been found in eukaryotic stromal cell-derived factor 2 (SDF-2) and in Chlamydia trachomatis protein CT153. The MIR domain may have a ligand transferase function. This domain has a closed beta-barrel structure with a hairpin triplet, and has an internal pseudo-threefold symmetry. The MIR motifs that make up the MIR domain consist of ~50 residues and are often found in multiple copies.

    Inositol 1,4,5-trisphosphate (InsP3) is an intracellular second messenger that transduces growth factor and neurotransmitter signals. InsP3 mediates the release of Ca2+ from intracellular stores by binding to specific Ca2+ channel-coupled receptors. Ryanodine receptors are involved in communication between transverse-tubules and the sarcoplasmic reticulum of cardiac and skeletal muscle. The proteins function as Ca2+-release channels following depolarisation of transverse-tubules. The function is modulated by Ca2+, Mg2+, ATP and calmodulin. Deficiency in the ryanodine receptor may be the cause of malignant hyperthermia (MH) and of central core disease of muscle (CCD). Protein O-mannosyltransferases transfer mannose from DOL-P-mannose to Ser or Thr residues on proteins.

    Proteins where this domain is known:
    PF10_0104   


    PS50920 - SOLCAR (Prosite link)

    Interpro entry IPR018108 : (Interpro link)

    Interpro description:

    A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.

    Proteins where this domain is known:
    PF08_0031    PF08_0093    PF10_0051    PF10_0366    PF13_0359    PFA0415c    PFA0435w    PFD0367w    PFI0255c    PFI0425w    PFL0110c    PFL1145w    PFL2000w   


    PS50922 - TLC (Prosite link)

    Interpro entry IPR006634 : TRAM, LAG1 and CLN8 homology (Interpro link)

    Interpro description:

    TLC is a protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon, and the CLN8 gene is mutated in Northern epilepsy syndrome. Proteins containing this domain may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.

    Proteins where this domain is known:
    PF14_0034    PFE0405c   


    PS50929 - ABC_TM1F (Prosite link)

    Interpro entry IPR017940 : (Interpro link)

    Interpro description:

    ABC transporters belong to the ATP-Binding Cassette (ABC) superfamily, which uses the hydrolysis of ATP to energise diverse biological systems. ABC transporters minimally consist of two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). These can be found on the same protein or on two different ones. Most ABC transporters function as a dimer and therefore are constituted of four domains, two ABC modules and two TMDs.

    ABC transporters are involved in the export or import of a wide variety of substrates ranging from small ions to macromolecules. The major function of ABC import systems is to provide essential nutrients to bacteria. They are found only in prokaryotes and their four constitutive domains are usually encoded by independent polypeptides (two ABC proteins and two TMD proteins). Prokaryotic importers require additional extracytoplasmic binding proteins (one or more per system) for function. In contrast, export systems are involved in the extrusion of noxious substances, the export of extracellular toxins and the targeting of membrane components. They are found in all living organisms and in general the TMD is fused to the ABC module in a variety of combinations. Some eukaryotic exporters encode the four domains on the same polypeptide chain.

    The ABC module (approximately two hundred amino acid residues) is known to bind and hydrolyse ATP, thereby coupling transport to ATP hydrolysis in a large number of biological processes. The cassette is duplicated in several subfamilies. Its primary sequence is highly conserved, displaying a typical phosphate-binding loop: Walker A, and a magnesium binding site: Walker B. Besides these two regions, three other conserved motifs are present in the ABC cassette: the switch region which contains a histidine loop, postulated to polarise the attaching water molecule for hydrolysis, the signature conserved motif (LSGGQ) specific to the ABC transporter, and the Q-motif (between Walker A and the signature), which interacts with the gamma phosphate through a water bond. The Walker A, Walker B, Q-loop and switch region form the nucleotide binding site.

    The 3D structure of a monomeric ABC module adopts a stubby L-shape with two distinct arms. ArmI (mainly beta-strand) contains Walker A and Walker B. The important residues for ATP hydrolysis and/or binding are located in the P-loop. The ATP-binding pocket is located at the extremity of armI. The perpendicular armII contains mostly the alpha helical subdomain with the signature motif. It only seems to be required for structural integrity of the ABC module. ArmII is in direct contact with the TMD. The hinge between armI and armII contains both the histidine loop and the Q-loop, making contact with the gamma phosphate of the ATP molecule. ATP hydrolysis leads to a conformational change that could facilitate ADP release. In the dimer the two ABC cassettes contact each other through hydrophobic interactions at the antiparallel beta-sheet of armI by a two-fold axis.

    The ATP-Binding Cassette (ABC) superfamily forms one of the largest of all protein families with a diversity of physiological functions. Several studies have shown that there is a correlation between the functional characterisation and the phylogenetic classification of the ABC cassette. More than 50 subfamilies have been described based on a phylogenetic and functional classification (for further information see http://www.tcdb.org/tcdb/index.php?tc=3.A.1).

    ABC transporters minimally contain two conserved regions: a highly conserved ATP binding cassette (ABC) and a less conserved transmembrane domain (TMD). In certain bacterial transporters, these regions are found on different polypeptides. The function of the integral inner-membrane protein is to translocate the substrate across the membrane, as well as in substrate recognition.

    This entry represents the ABC transporter integral membrane type 1 fused domain.

    Proteins where this domain is known:
    PF11_0466    PF13_0218    PF13_0271    PF14_0455    PFA0590w    PFC0125w    PFE1150w    PFL0495c    PFL1410c   


    PS50934 - SWIRM (Prosite link)

    Interpro entry IPR007526 : (Interpro link)

    Interpro description:

    The SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in eukaryotic chromosomal proteins. It is named after the proteins SWI3, RSC8 and MOIRA in which it was first recognised. This domain is predicted to mediate protein-protein interactions in the assembly of chromatin-protein complexes. The SWIRM domain can be linked to different domains, such as the ZZ-type zinc finger, the Myb DNA-binding domain, the HORMA domain, the amino-oxidase domain, the chromo domain, and the JAB1/PAD1 domain.

    Proteins where this domain is known:
    PFL1215c   


    PS50935 - SSB (Prosite link)

    Interpro entry IPR000424 : Primosome PriB/single-strand DNA-binding (Interpro link)

    Interpro description:
    The Escherichia coli single-strand binding protein (gene ssb), also known as the helix-destabilizing protein, is a protein of 177 amino acids. It binds tightly, as a homotetramer, to single-stranded DNA (ss-DNA) and plays an important role in DNA replication, recombination and repair. Closely related variants of SSB are encoded in the genome of a variety of large self-transmissible plasmids. SSB has also been characterised in bacteria such as Proteus mirabilis or Serratia marcescens. Eukaryotic mitochondrial proteins that bind ss-DNA and are probably involved in mitochondrial DNA replication are structurally and evolutionarily related to prokaryotic SSB.

    Proteins where this domain is known:
    PFE0435c   


    PS50942 - ENTH (Prosite link)

    Interpro entry IPR013809 : (Interpro link)

    Interpro description:

    The ENTH (Epsin N-terminal homology) domain is approximately 150 amino acids in length and is always found located at the N-termini of proteins. The domain forms a compact globular structure, composed of 9 alpha-helices connected by loops of varying length. The general topology is determined by three helical hairpins that are stacked consecutively with a right hand twist. An N-terminal helix folds back, forming a deep basic groove that forms the binding pocket for the Ins(1,4,5)P3 ligand. The ligand is coordinated by residues from surrounding alpha-helices and all three phosphates are multiply coordinated. The coordination of Ins(1,4,5)P3 suggests that ENTH is specific for particular head groups.

    Proteins containing this domain have been found to bind PtdIns(4,5)P2 and PtdIns(1,4,5)P3, suggesting that the domain may be a membrane-interacting module. The main function of proteins containing this domain appears to be to act as accessory clathrin adaptors in endocytosis, but they may have additional roles in signalling and actin regulation. Epsin is able to recruit clathrin and promote its polymerisation on a lipid monolayer. Epsin causes a strong degree of membrane curvature and tubulation, and even fragmentation of membranes with a high PtdIns(4,5)P2 content. Epsin binding to membranes facilitates their deformation by insertion of the N-terminal helix into the outer leaflet of the bilayer, pushing the head groups apart. This would reduce the energy needed to curve the membrane into a vesicle, making it easier for the clathrin cage to fix and stabilise the curved membrane. This points to a pioneering role for epsin in vesicle budding, as it provides both a driving force and a link between membrane invagination and clathrin polymerisation.

    Proteins where this domain is known:
    PFL2195w   


    PS50943 - HTH_CROC1 (Prosite link)

    Interpro entry IPR001387 : Helix-turn-helix type 3 (Interpro link)

    Interpro description:

    This is a large family of DNA-binding helix-turn-helix proteins that includes a bacterial plasmid copy control protein, bacterial methylases, various bacteriophage transcription control proteins and a vegetative specific protein from Dictyostelium discoideum (Slime mould).

    Proteins where this domain is known:
    PF11_0293   


    PS50957 - JOSEPHIN (Prosite link)

    Interpro entry IPR006155 : (Interpro link)

    Interpro description:
    Human genes containing triplet repeats can markedly expand in length, leading to neuropsychiatric disease. Expansion of triplet repeats explains the phenomenon of anticipation, i.e. the increasing severity or earlier age of onset in successive generations in a pedigree. A novel gene containing CAG repeats has been identified and mapped to chromosome 14q32.1, the genetic locus for Machado-Joseph disease (MJD). Normally, the gene contains 13-36 CAG repeats, but most clinically diagnosed patients and all affected members of a family with the clinical and pathological diagnosis of MJD show expansion of the repeat number to 68-79. Similar abnormalities in related genes may give rise to diseases similar to MJD. MJD is a neurodegenerative disorder characterised by cerebellar ataxia, pyramidal and extra-pyramidal signs, peripheral nerve palsy, external ophthalmoplegia, facial and lingual fasciculation and bulging. The disease is autosomal dominant, with late onset of symptoms, generally after the fourth decade.
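
    The repeat counts quoted above can be illustrated with a minimal sketch that measures the longest uninterrupted CAG tract in a DNA string and compares it with the two ranges given in the description (13-36 and 68-79). The function names and the labels returned by classify_repeat_count are hypothetical and chosen only for this illustration; this is not a diagnostic method.

```python
import re


def longest_cag_run(dna: str) -> int:
    """Return the repeat count of the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", dna.upper())
    # Each matched run is a whole number of CAG units, so length // 3 is the count.
    return max((len(run) // 3 for run in runs), default=0)


def classify_repeat_count(n: int) -> str:
    """Label a repeat count using only the ranges quoted in the description.

    The labels are illustrative assumptions, not clinical categories.
    """
    if 13 <= n <= 36:
        return "normal"
    if 68 <= n <= 79:
        return "expanded"
    return "unclassified"


if __name__ == "__main__":
    # Synthetic example: 20 repeats, a CAA interruption, then 5 more repeats.
    seq = "TT" + "CAG" * 20 + "CAA" + "CAG" * 5 + "GG"
    n = longest_cag_run(seq)
    print(n, classify_repeat_count(n))
```

    Counts that fall between or outside the two quoted ranges are reported as "unclassified", because the description says nothing about them.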

    Proteins where this domain is known:
    PF11_0125    PFL1295w   


    PS50966 - ZF_SWIM (Prosite link)

    Interpro entry IPR007527 : Zinc finger, SWIM-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the SWIM (SWI2/SNF2 and MuDR) zinc-binding domain, which is found in a variety of prokaryotic and eukaryotic proteins, such as mitogen-activated protein kinase kinase kinase 1 (or MEKK1). It is also found in the related protein MEX (MEKK1-related protein X), a testis-expressed protein that acts as an E3 ubiquitin ligase through the action of E2 ubiquitin-conjugating enzymes in the proteasome degradation pathway; the SWIM domain is critical for MEX ubiquitination. SWIM domains are also found in the homologous recombination protein Sws1, as well as in several hypothetical proteins.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFC0630w   


    PS50967 - HRDC (Prosite link)

    Interpro entry IPR002121 : Helicase and RNase D C-terminal, HRDC (Interpro link)

    Interpro description:
    The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain associated with the human BLM gene result in Bloom Syndrome (BS), an autosomal recessive disorder characterised by proportionate pre- and postnatal growth deficiency; sun-sensitive, telangiectatic, hypo- and hyperpigmented skin; predisposition to malignancy; and chromosomal instability.

    Proteins where this domain is known:
    PF14_0169    PF14_0473   


    PS50968 - BIOTINYL_LIPOYL (Prosite link)

    Interpro entry IPR000089 : (Interpro link)

    Interpro description:
    The biotin / lipoyl attachment domain has a conserved lysine residue that binds biotin or lipoic acid. Biotin plays a catalytic role in some carboxyl transfer reactions and is covalently attached, via an amide bond, to a lysine residue in enzymes requiring this coenzyme. E2 acyltransferases have an essential cofactor, lipoic acid, which is covalently bound via an amide linkage to a lysine group. The lipoic acid cofactor is found in a variety of proteins that include the H-protein of the glycine cleavage system (GCS), mammalian and yeast pyruvate dehydrogenases and fast migrating protein (FMP) (gene acoC) from Ralstonia eutropha (Alcaligenes eutrophus).

    Proteins where this domain is known:
    PF10_0407    PF13_0121    PF14_0664    PFC0170c   


    PS50969 - FCP1 (Prosite link)

    Interpro entry IPR004274 : (Interpro link)

    Interpro description:
    The function of this domain is unclear. It is found in proteins of diverse function, including phosphatases, some of which may be active in ternary elongation complexes, and a number of NLI interacting factors. In the phosphatases this domain is often present N-terminal to the BRCT domain.

    Proteins where this domain is known:
    MAL13P1.275    PF07_0110    PF10_0124    PFE0795c   


    PS50972 - PTERIN_BINDING (Prosite link)

    Interpro entry IPR000489 : Dihydropteroate synthase, DHPS (Interpro link)

    Interpro description:

    All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. Most microorganisms must synthesize folate de novo because they lack the active transport system of higher vertebrate cells that allows these organisms to use dietary folates. Proteins containing this domain include dihydropteroate synthase as well as a group of methyltransferase enzymes including methyltetrahydrofolate, corrinoid iron-sulphur protein methyltransferase (MeTr), which catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation.

    Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate to para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).

    Proteins where this domain is known:
    PF08_0095   


    PS50975 - ATP_GRASP (Prosite link)

    Interpro entry IPR011761 : ATP-grasp fold (Interpro link)

    Interpro description:

    The ATP-grasp superfamily currently includes 17 groups of enzymes, catalyzing ATP-dependent ligation of a carboxylate containing molecule to an amino or thiol group-containing molecule. They contribute predominantly to macromolecular synthesis. ATP-hydrolysis is used to activate a substrate. For example, DD-ligase transfers phosphate from ATP to D-alanine on the first step of catalysis. On the second step the resulting acylphosphate is attacked by a second D-alanine to produce a DD dipeptide following phosphate elimination.

    The ATP-grasp domain contains three conserved motifs, corresponding to the phosphate binding loop and the Mg(2+) binding site. The fold is characterised by two alpha-beta subdomains that grasp the ATP molecule between them. Each subdomain provides a variable loop that forms part of the active site, with regions from other domains also contributing to the active site, even though these other domains are not conserved between the various ATP-grasp enzymes.

    Proteins where this domain is known:
    PF13_0044    PF14_0664   


    PS50979 - BC (Prosite link)

    Interpro entry IPR011764 : Biotin carboxylation region (Interpro link)

    Interpro description:

    Biotin-dependent carboxylase enzymes perform a two-step reaction. Enzyme-bound biotin is first carboxylated by bicarbonate and ATP, and the carboxyl group temporarily bound to biotin is subsequently transferred to an acceptor substrate such as pyruvate or acetyl-CoA. The first step is mediated by the BC domain common to all biotin-dependent carboxylases. The BC domain can be divided into three subdomains (N-terminal, central and C-terminal). The N-terminal region provides part of the active site; the central region corresponds to the ATP-grasp domain, which is common to many ATP-dependent enzymes involved in macromolecular synthesis. The ATP-grasp module directly binds the ATP molecule. The C-terminal subdomain is involved in dimer formation.

    Several structures of the BC domain have been solved. The central module is splayed significantly away from the main body of the domain and is able to rotate by approximately 45 degrees upon nucleotide binding, thereby closing off the active site pocket.

    Proteins where this domain is known:
    PF14_0664   


    PS50980 - COA_CT_NTER (Prosite link)

    Interpro entry IPR011762 : Acetyl-coenzyme A carboxyltransferase, N-terminal (Interpro link)

    Interpro description:

    Acetyl-coenzyme A carboxylase (ACC), a member of the biotin-dependent enzyme family, catalyses the formation of malonyl-coenzyme A (CoA) and regulates fatty acid biosynthesis and oxidation. Biotin-dependent carboxylase enzymes perform a two step reaction: enzyme-bound biotin is first carboxylated by bicarbonate and ATP and the carboxyl group temporarily bound to biotin is subsequently transferred to an acceptor substrate such as acetyl-CoA. The carboxyltransferase domain performs the second part of the reaction.

    The N- and C-terminal regions of the carboxyltransferase domain share similar polypeptide backbone folds, with a central beta-beta-alpha superhelix. The CoA molecule is mostly associated with the N subdomain. In bacterial acetyl coenzyme A carboxylase the N and C subdomains are encoded by two different polypeptides.

    This entry represents the N terminal subdomain and contains the bacterial ACC beta-subunit.

    Proteins where this domain is known:
    PF14_0664   


    PS50984 - TRUD (Prosite link)

    Interpro entry IPR011760 : tRNA pseudouridine synthase D, core (Interpro link)

    Interpro description:

    The most abundant modification seen in structured RNAs (transfer, ribosomal, and splicing RNAs) is the isomerization of uridine (U) to pseudouridine (5-ribosyluracil). Pseudouridine is made by a set of enzymes called pseudouridine synthases, which select specific U residues in a polynucleotide chain for isomerization to pseudouridine. Pseudouridine synthases are ubiquitous, as putative synthase genes have been found in all genomes so far sequenced. TruD, a pseudouridine synthase in Escherichia coli, is responsible for modifying U13 in tRNA-Glu to pseudouridine. Homologs of truD have been identified in eubacteria, archaea, and eukarya. Because all of the organisms known to have pseudouridine 13 in their tRNAs also have a truD homolog, it is reasonable to infer that truD homologs in those organisms with tRNA pseudouridine 13 are the responsible synthases.

    TruD folds into a V-shaped molecule with two distinct modules: a catalytic domain that differs in sequence but is structurally very similar to the catalytic domain of other pseudouridine synthases, and a TRUD domain of ~150 amino acids with an alpha/beta fold. The TRUD domain forms a compact fold that is tilted away from the catalytic domain to form a deep cleft in truD which is lined with basic residues from each domain. The TRUD domain is always associated with a truD-type catalytic domain and is not found on its own or attached to another type of protein as a separate module. Furthermore, there are no truD-type catalytic domains that lack the TRUD domain insert. The TRUD domain is characterised by two conserved sequence motifs that form a part of the hydrophobic core. The TRUD domain sequence in the truD family is also characterised by large insertions at several specific sites that are seen in many archaeal and eukaryotic homologs. The TRUD domain is likely to be involved in substrate recognition and may represent an RNA-binding module.

    Proteins where this domain is known:
    PF07_0125    PF10_0341   


    PS50989 - COA_CT_CTER (Prosite link)

    Interpro entry IPR011763 : Acetyl-coenzyme A carboxyltransferase, C-terminal (Interpro link)

    Interpro description:

    Acetyl-coenzyme A carboxylase (ACC), a member of the biotin-dependent enzyme family, catalyses the formation of malonyl-coenzyme A (CoA) and regulates fatty acid biosynthesis and oxidation. Biotin-dependent carboxylase enzymes perform a two step reaction: enzyme-bound biotin is first carboxylated by bicarbonate and ATP and the carboxyl group temporarily bound to biotin is subsequently transferred to an acceptor substrate such as acetyl-CoA. The carboxyltransferase domain performs the second part of the reaction.

    The N- and C-terminal regions of the carboxyltransferase domain share similar polypeptide backbone folds, with a central beta-beta-alpha superhelix. The CoA molecule is mostly associated with the N subdomain. In bacterial acetyl coenzyme A carboxylase the N and C subdomains are encoded by two different polypeptides.

    Proteins where this domain is known:
    PF14_0664   


    PS51006 - SPERMIDINE_SYNTHASE_2 (Prosite link)

    Interpro entry IPR001045 : Spermine synthase (Interpro link)

    Interpro description:
    Synonym(s): Spermidine aminopropyltransferase

    A group of polyamine biosynthetic enzymes involved in the fifth (last) step in the biosynthesis of spermidine from arginine and methionine, which includes spermidine synthase, spermine synthase and putrescine N-methyltransferase.

    The Thermotoga maritima spermidine synthase monomer consists of two domains: an N-terminal domain composed of six beta-strands, and a Rossmann-like C-terminal domain. The larger C-terminal catalytic core domain consists of a seven-stranded beta-sheet flanked by nine alpha helices. This domain resembles a topology observed in a number of nucleotide and dinucleotide-binding enzymes, and in S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases).

    Proteins where this domain is known:
    PF11_0301   


    PS51007 - CYTC (Prosite link)

    Interpro entry IPR009056 : Cytochrome c, monohaem (Interpro link)

    Interpro description:

    After cytochrome c is synthesized in the cytoplasm as apocytochrome c, it is transported through the outer mitochondrial membrane to the intermembrane space, where haem is covalently attached by thioether bonds to two cysteine residues located in the cytochrome c centre. Cytochrome c is required during oxidative phosphorylation as an electron shuttle between Complex III (cytochrome c reductase) and IV (cytochrome c oxidase). In addition, cytochrome c is involved in apoptosis in more complex organisms such as Xenopus, rats and humans. Cellular stress can induce cytochrome c release from the mitochondrial membrane. In mammals, cytochrome c triggers the assembly of the apoptosome, consisting of cytochrome c, Apaf-1 and dATP, which activates caspase-9, leading to cell death. There are several different members of the cytochrome c family with different functional roles, for instance cytochrome c549 is associated with photosystem II.

    The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes. The consensus sequence for the cytochrome c centre is Cys-X-X-Cys-His, where the histidine residue is one of the two axial ligands of the haem iron. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes both mono-haem proteins and multi-haem proteins. This entry represents mono-haem cytochrome c proteins (excluding class II and f-type cytochromes), such as cytochromes c, c1, c2, c5, c555, c550 to c553, c556, and c6.
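
    The Cys-X-X-Cys-His consensus described above can be searched for with a short regular-expression sketch. The function name find_cxxch and the example sequence are hypothetical and used only for illustration. A match shows only that the five-residue pattern is present; it does not establish that a protein belongs to the cytochrome c family, which the Prosite profile assesses over the whole domain.

```python
import re

# Cys, any two residues, Cys, His. The lookahead allows overlapping hits
# to be reported, since each match consumes no characters.
CXXCH = re.compile(r"(?=(C[A-Z]{2}CH))")


def find_cxxch(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each CXXCH hit."""
    return [(m.start() + 1, m.group(1)) for m in CXXCH.finditer(protein.upper())]


if __name__ == "__main__":
    # Made-up peptide, not a real cytochrome sequence.
    for position, motif in find_cxxch("MKCAACHGG"):
        print(position, motif)
```

    Positions are reported 1-based, following the usual convention for protein sequences.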

    Cytochrome c-type centres are also found in the active sites of many enzymes, including cytochrome cd1-nitrite reductase as the N-terminal haem c domain, in quinoprotein alcohol dehydrogenase as the C-terminal domain, in quinohemoprotein amine dehydrogenase A chain as domains 1 and 2, and in the cytochrome bc1 complex as the cytochrome bc1 domain.

    Proteins where this domain is known:
    MAL13P1.55    PF14_0038    PF14_0597   


    PS51011 - ARID (Prosite link)

    Interpro entry IPR001606 : AT-rich interaction region (Interpro link)

    Interpro description:

    Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.

    The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.

    Proteins where this domain is known:
    PFF0175c   


    PS51025 - PWI (Prosite link)

    Interpro entry IPR002483 : Splicing factor PWI (Interpro link)

    Interpro description:

    The PWI domain, named after a highly conserved PWI tri-peptide located within its N-terminal region, is a ~80 amino acid module, which is found either at the N-terminus or at the C-terminus of eukaryotic proteins involved in pre-mRNA processing. It is generally found in association with other domains such as RRM and RS. The PWI domain is an RNA/DNA-binding domain that has an equal preference for single- and double-stranded nucleic acids and is likely to have multiple important functions in pre-mRNA processing. Proteins containing this domain include the SR-related nuclear matrix protein of 160 kD (SRm160) splicing and 3'-end cleavage-stimulatory factor, and the mammalian splicing factor PRP3.

    The PWI domain is a soluble, globular and independently folded domain which consists of a four-helix bundle, with structured N- and C-terminal elements.

    Proteins where this domain is known:
    PFC0465c    PFF0505c   


    PS51036 - ZF_A20 (Prosite link)

    Interpro entry IPR002653 : Zinc finger, A20-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0056   


    PS51037 - YEATS (Prosite link)

    Interpro entry IPR005033 : YEATS (Interpro link)

    Interpro description:

    Named the YEATS family, after 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', this family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.

    Proteins where this domain is known:
    MAL8P1.131   


    PS51039 - ZF_AN1 (Prosite link)

    Interpro entry IPR000058 : Zinc finger, AN1-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like protein in Xenopus laevis. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate 2 zinc atoms.

    Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16). For example, the human protein Znf216 has an A20 zinc-finger at the N-terminus and an AN1 zinc-finger at the C-terminus, acting to negatively regulate the NFkappaB activation pathway and to interact with components of the immune response like RIP, IKKgamma and TRAF6. The interaction of Znf216 with IKK-gamma and RIP is mediated by the A20 zinc-finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc-finger domain; therefore, both zinc-finger domains are involved in regulating the immune response. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0056    PFE0200c   


    PS51043 - DDHD (Prosite link)

    Interpro entry IPR004177 : DDHD (Interpro link)

    Interpro description:
    The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases.

    Proteins where this domain is known:
    MAL8P1.91   


    PS51044 - ZF_SP_RING (Prosite link)

    Interpro entry IPR004181 : Zinc finger, MIZ-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents MIZ-type zinc finger domains. Miz1 (Msx-interacting-zinc finger) is a zinc finger-containing protein with homology to the yeast protein Nfi-1. Miz1 is a sequence-specific DNA-binding protein that can function as a positive-acting transcription factor. Miz1 binds to the homeobox protein Msx2, enhancing the specific DNA-binding ability of Msx2. Other proteins containing this domain include the human PIAS family (protein inhibitor of activated STAT protein).

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.302   


    PS51048 - SGS (Prosite link)

    Interpro entry IPR007699 : (Interpro link)

    Interpro description:
    This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins. Sgt1p is a highly conserved eukaryotic protein that is required for both SCF (Skp1p/Cdc53p-Cullin-F-box)-mediated ubiquitination and kinetochore function in yeast and also plays a role in the cAMP pathway. Calcyclin (S100A6) is a member of the S100A family of calcium binding proteins and appears to play a role in cell proliferation.

    Proteins where this domain is known:
    PFI1610c    PFL1845c   


    PS51050 - ZF_CW (Prosite link)

    Interpro entry IPR011124 : Zinc finger, CW-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents a CW-type zinc finger motif, named for its conserved cysteine and tryptophan residues. It is predicted to be a highly specialised mononuclear four-cysteine (C4) zinc finger that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including chromatin methylation status and early embryonic development. Weak homology to members of ... further evidences these predictions. The domain is found exclusively in vertebrates, vertebrate-infecting parasites and higher plants.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFD0970c   


    PS51069 - GBP (Prosite link)

    Interpro entry IPR003681 : (Interpro link)

    Interpro description:

    The glycophorin-binding protein contains a tandem repeat. The repeated sequence determines the binding domain for an erythrocyte receptor binding protein of Plasmodium falciparum, the malarial parasite. Erythrocyte invasion by the malarial merozoite is a receptor-mediated process, an obligatory step in the development of the parasite. The P. falciparum protein binds to the erythrocyte receptor glycophorin.

    Proteins where this domain is known:
    PF10_0159    PF13_0010    PF14_0010   


    PS51072 - MHD (Prosite link)

    Interpro entry IPR008968 : Clathrin adaptor, mu subunit, C-terminal (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents the C-terminal domain of the mu subunit from various clathrin adaptors (AP1, AP2 and AP3). The C-terminal domain has an immunoglobulin-like beta-sandwich fold consisting of 9 strands in 2 sheets with a Greek key topology, similar to that found in cytochrome f and certain transcription factors. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0202    PF11_0359    PF13_0062    PF14_0386    PFL0885w   


    PS51074 - ZF_DPH (Prosite link)

    Interpro entry IPR007872 : (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents a probable zinc-binding motif that contains four cysteines and may chelate zinc, known as the DPH-type after the diphthamide (DPH) biosynthesis protein in which it was first characterised, including the proteins DPH3 and DPH4. This domain is also found associated with the N-terminal domain of heat shock protein DnaJ.

    Diphthamide is a unique post-translationally modified histidine residue found only in translation elongation factor 2 (eEF-2). It is conserved from archaea to humans and serves as the target for diphtheria toxin and Pseudomonas exotoxin A. These two toxins catalyse the transfer of ADP-ribose to diphthamide on eEF-2, thus inactivating eEF-2, halting cellular protein synthesis, and causing cell death. The biosynthesis of diphthamide is dependent on at least five proteins, DPH1 to -5, and a still unidentified amidating enzyme. DPH3 and DPH4 share a conserved region, which encodes a putative zinc finger, the DPH-type or CSL-type (after the conserved motif of the final cysteine) zinc finger. The function of this motif is unknown.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFE0135w    PFL0860c   


    PS51083 - ZF_HIT (Prosite link)

    Interpro entry IPR007529 : (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the HIT-type zinc finger, which contains 7 conserved cysteines and one histidine that can potentially coordinate two zinc atoms. It has been named after the first protein that originally defined the domain: the yeast HIT1 protein. The HIT-type zinc finger displays some sequence similarities to the MYND-type zinc finger. The function of this domain is unknown but it is mainly found in nuclear proteins involved in gene regulation and chromatin remodeling. This domain is also found in the thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL8P1.141    PFI0590c    PFI0825w   


    PS51084 - HIT_2 (Prosite link)

    Interpro entry IPR001310 : (Interpro link)

    Interpro description:

    The Histidine Triad (HIT) motif, His-phi-His-phi-His-phi-phi (phi, a hydrophobic amino acid) was identified as being highly conserved in a variety of organisms. Crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles. Hint homologues including rabbit Hint and yeast Hnt1 hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product and function as positive regulators of Cdk7/Kin28 in vivo. Fhit homologues are diadenosine polyphosphate hydrolases and function as tumour suppressors in human and mouse though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them.
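The His-phi-His-phi-His-phi-phi pattern described above can be expressed as a regular expression for scanning a protein sequence. This is a minimal sketch, not the Prosite profile itself: the set of residues counted as hydrophobic (`HYDROPHOBIC`) and the function name `find_hit_motifs` are assumptions made for illustration.

```python
import re

# Assumption: one-letter codes treated as "phi" (hydrophobic) residues.
# The source only says "a hydrophobic amino acid"; adjust this set as needed.
HYDROPHOBIC = "AVILMFW"

# His-phi-His-phi-His-phi-phi
HIT_MOTIF = re.compile(rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}")


def find_hit_motifs(sequence):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in HIT_MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence, not a real P. falciparum protein.
    print(find_hit_motifs("MKAHVHVHLLGG"))
```

A plain pattern match of this kind will over- and under-call compared with a curated profile, so it is best treated as a quick first filter.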

    Proteins where this domain is known:
    PF08_0059    PF14_0349   


    PS51085 - 2FE2S_FER_2 (Prosite link)

    Interpro entry IPR001041 : Ferredoxin (Interpro link)

    Interpro description:

    The ferredoxin protein family are electron carrier proteins with an iron-sulphur cofactor that act in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s) and according to sequence similarities.

    This entry represents members of the 2Fe-2S ferredoxin family that have a general core structure consisting of beta(2)-alpha-beta(2), which includes putidaredoxin and terpredoxin, and adrenodoxin. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes and in multidomain proteins, such as aldehyde oxidoreductase (N-terminal), xanthine oxidase (N-terminal), phthalate dioxygenase reductase (C-terminal), succinate dehydrogenase iron-sulphur protein (N-terminal), and methane monooxygenase reductase (N-terminal).

    Proteins where this domain is known:
    MAL13P1.95    PFL0630w    PFL0705c   


    PS51112 - AMMECR1 (Prosite link)

    Interpro entry IPR002733 : (Interpro link)

    Interpro description:

    The contiguous gene deletion syndrome is characterised by Alport syndrome (A), mental retardation (M), midface hypoplasia (M), and elliptocytosis (E), as well as generalized hypoplasia and cardiac abnormalities. It is caused by a deletion in Xq22.3, comprising several genes including AMME chromosomal region gene 1 (AMMECR1), which encodes a protein with a nuclear location and presently unknown function. The C-terminal region of AMMECR1 (from residue 122 to 333) is well conserved, and homologues appear in species ranging from bacteria and archaea to eukaryotes. The high level of conservation of the AMMECR1 domain points to a basic cellular function, potentially in either the transcription, replication, repair or translation machinery.

    The AMMECR1 domain contains a 6-amino-acid motif (LRGCIG) that might be functionally important since it is strikingly conserved throughout evolution. The AMMECR1 domain consists of two distinct subdomains of different sizes. The large subdomain, which contains both the N- and C-terminal regions, consists of five alpha-helices and five beta-strands. These five beta-strands form an antiparallel beta-sheet. The small subdomain consists of four alpha-helices and three beta-strands, and these beta-strands also form an antiparallel beta-sheet. The conserved 'LRGCIG' motif is located at beta(2) and its N-terminal loop, and most of the side chains of these residues point toward the interface of the two subdomains. The two subdomains are connected by only two loops, and the interaction between the two subdomains is not strong. Thus, these subdomains may move dynamically when the substrate enters the cleft. The size of the cleft suggests that the substrate is large, e.g., the substrate may be a nucleic acid or protein. However, the inner side of the cleft is not filled with positively charged residues, and therefore it is unlikely that negatively charged nucleic acids such as DNA or RNA interact at this site.

    Proteins where this domain is known:
    MAL13P1.172   


    PS51129 - PDXS_SNZ_2 (Prosite link)

    Interpro entry IPR001852 : (Interpro link)

    Interpro description:

    Snz1p is a highly conserved protein involved in growth arrest in Saccharomyces cerevisiae (Baker's yeast). Sor1 (singlet oxygen resistance) is essential in pyridoxine (vitamin B6) synthesis in Cercospora nicotianae and Aspergillus flavus. Pyridoxine quenches singlet oxygen at a rate comparable to that of vitamins C and E, two of the most highly efficient biological antioxidants, suggesting a previously unknown role for pyridoxine in active oxygen resistance.

    Proteins where this domain is known:
    PFF1025c   


    PS51130 - PDXT_SNO_2 (Prosite link)

    Interpro entry IPR002161 : (Interpro link)

    Interpro description:

    Members of this family are involved in the pyridoxine biosynthetic pathway. The regulation of cellular growth and proliferation in response to environmental cues is critical for development and the maintenance of viability in all organisms. In unicellular organisms, such as the budding yeast Saccharomyces cerevisiae (Baker's yeast), growth and proliferation are regulated by nutrient availability.

    Proteins where this domain is known:
    PF11_0169   


    PS51133 - ZF_TFIIS_2 (Prosite link)

    Interpro entry IPR001222 : Zinc finger, TFIIS-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes the initiation of transcription of protein-encoding genes by polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon, characterised by a three-stranded antiparallel beta-sheet and two beta-hairpins. A backbone model for the Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta-hairpin protrudes from the zinc finger and complements the Pol II active site.

    Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF07_0057    PFA0505c    PFB0290c    PFD0360w   


    PS51134 - ZF_TFIIB (Prosite link)

    Interpro entry IPR013137 : Zinc finger, TFIIB-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents a zinc finger motif found in transcription factor IIB (TFIIB). In eukaryotes the initiation of transcription of protein-encoding genes by the polymerase II complex (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least seven different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, and -IIH.

    TFIIB and TFIID are responsible for promoter recognition and interaction with Pol II; together with Pol II, they form a minimal initiation complex capable of transcription under certain conditions. The TATA box of a Pol II promoter is bound in the initiation complex by the TBP subunit of TFIID, which bends the DNA around the C-terminal domain of TFIIB, whereas the N-terminal zinc finger of TFIIB interacts with Pol II.

    The TFIIB zinc finger adopts a zinc ribbon fold characterised by two beta-hairpins forming two structurally similar zinc-binding sub-sites. The zinc finger contacts the Rpb1 subunit of Pol II through its dock domain, a conserved region of about 70 amino acids located close to the polymerase active site. In the Pol II complex this surface is located near the RNA exit groove. Interestingly, this sequence is best conserved in the three polymerases that utilise a TFIIB-like general transcription factor (Pol II, Pol III, and archaeal RNA polymerase) but not in Pol I.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF14_0469    PFA0525w   


    PS51143 - MT_A70 (Prosite link)

    Interpro entry IPR007757 : MT-A70 (Interpro link)

    Interpro description:
    MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs.

    Proteins where this domain is known:
    PF07_0123    PFL1715w   


    PS51144 - ALPHA_CA_2 (Prosite link)

    Interpro entry IPR001148 : (Interpro link)

    Interpro description:

    Carbonic anhydrases (CA) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.
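The two-step mechanism described above can be written out schematically. This is a sketch under the usual textbook formulation, with E standing for the enzyme and its three histidine ligands; the intermediate exchange of bound bicarbonate for water is shown as part of the first step, and the notation is chosen here for illustration rather than taken from the entry.

```latex
\begin{align}
\text{Step 1:}\quad
  \mathrm{E{\cdot}Zn^{2+}{-}OH^{-}} + \mathrm{CO_2}
    &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}HCO_3^{-}} \\
  \mathrm{E{\cdot}Zn^{2+}{-}HCO_3^{-}} + \mathrm{H_2O}
    &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}OH_2} + \mathrm{HCO_3^{-}} \\
\text{Step 2:}\quad
  \mathrm{E{\cdot}Zn^{2+}{-}OH_2}
    &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}OH^{-}} + \mathrm{H^{+}} \\
\text{Net:}\quad
  \mathrm{CO_2} + \mathrm{H_2O}
    &\rightleftharpoons \mathrm{HCO_3^{-}} + \mathrm{H^{+}}
\end{align}
```

Summing the three lines cancels every enzyme-bound species and leaves the net reversible hydration of carbon dioxide to bicarbonate.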

    This entry represents alpha class carbonic anhydrases.

    More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.

    Proteins where this domain is known:
    PF11_0411   


    PS51147 - PFTA (Prosite link)

    Interpro entry IPR002088 : Protein prenyltransferase, alpha subunit (Interpro link)

    Interpro description:

    Protein prenylation is the posttranslational attachment of either a farnesyl group or a geranylgeranyl group via a thioether linkage (-C-S-C-) to a cysteine at or near the carboxyl terminus of the protein. Farnesyl and geranylgeranyl groups are polyisoprenes, unsaturated hydrocarbons with a multiple of five carbons; the chain is 15 carbons long in the farnesyl moiety and 20 carbons long in the geranylgeranyl moiety. There are three different protein prenyltransferases in humans: farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) share the same motif (the CaaX box) around the cysteine in their substrates, and are thus called CaaX prenyltransferases, whereas geranylgeranyltransferase 2 (GGT2, also called Rab geranylgeranyltransferase) recognises a different motif and is thus called a non-CaaX prenyltransferase. Protein prenyltransferases are currently known only in eukaryotes, but they are widespread, being found in vertebrates, insects, nematodes, plants, fungi and protozoa, including several parasites.
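The CaaX box mentioned above is conventionally read as a cysteine, two aliphatic residues, and any residue at the extreme carboxyl terminus. A minimal sketch of a C-terminal check follows; the aliphatic residue set (`ALIPHATIC`) and the helper name `caax_box` are assumptions for illustration, and real substrates are known to deviate from such a strict pattern.

```python
import re

# Assumption: residues accepted at the two "a" (aliphatic) positions.
ALIPHATIC = "AVILM"

# C, two aliphatic residues, then any residue, anchored at the C-terminus.
CAAX = re.compile(rf"C[{ALIPHATIC}]{{2}}[A-Z]$")


def caax_box(sequence):
    """Return the C-terminal CaaX tetrapeptide if present, otherwise None."""
    match = CAAX.search(sequence.strip().upper())
    return match.group() if match else None


if __name__ == "__main__":
    # Toy sequences chosen only to exercise the pattern.
    for seq in ("MTEYKLVVVGAGGVGKSCVLS", "MKKCC"):
        print(seq, "->", caax_box(seq))
```

The pattern deliberately does not cover the non-CaaX motifs recognised by GGT2, which the text above treats as a separate class.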

    Each protein consists of two subunits, alpha and beta; the alpha subunit of FT and GGT1 is encoded by the same gene, FNTA. The alpha subunit is thought to participate in a stable complex with the isoprenyl substrate; the beta subunit binds the peptide substrate. In the alpha subunits of both types of protein prenyltransferases, seven tetratricopeptide repeats are formed by pairs of helices that are stabilized by conserved intercalating residues. The alpha subunits of GGT2 in mammals and plants also have an immunoglobulin-like domain between the fifth and sixth tetratricopeptide repeat, as well as leucine-rich repeats at the carboxyl terminus. The functions of these additional domains in GGT2 are as yet undefined, but they are apparently not directly involved in the interaction with substrates and Rab escort proteins. The tetratricopeptide repeats of the alpha subunit form a right-handed superhelix, which embraces the (alpha-alpha)6 barrel of the beta subunit.

    Proteins where this domain is known:
    PF14_0403    PFL2050w   


    PS51151 - NAC_AB (Prosite link)

    Interpro entry IPR002715 : (Interpro link)

    Interpro description:

    Nascent polypeptide-associated complex (NAC) is among the first ribosome-associated entities to bind the nascent polypeptide after peptide bond formation. The nascent polypeptide-associated complex (NAC) of yeast functions in the targeting process of ribosomes to the ER membrane. NAC may prevent binding of ribosome nascent chains (RNCs) without a signal sequence to yeast membranes.

    Proteins where this domain is known:
    PF14_0241    PFF1050w   


    PS51154 - MACRO (Prosite link)

    Interpro entry IPR002589 : (Interpro link)

    Interpro description:

    The Macro or A1pp domain is a module of about 180 amino acids which can bind ADP-ribose, an NAD metabolite or related ligands. The domain was described originally in association with ADP-ribose 1''-phosphate (Appr-1''-P) processing activity (A1pp) of the yeast YBR022W protein. The domain is also called Macro domain as it is the C-terminal domain of mammalian core histone macro-H2A. Macro domain proteins can be found in eukaryotes, in (mostly pathogenic) bacteria, in archaea and in ssRNA viruses, such as coronaviruses, Rubella and Hepatitis E viruses. In vertebrates the domain occurs e.g. in histone macroH2A, in predicted poly-ADP-ribose polymerases (PARPs) and in B aggressive lymphoma (BAL) protein. The macro domain can be associated with catalytic domains, such as PARP, or sirtuin. The Macro domain can recognize ADP-ribose or in some cases poly-ADP-ribose, which can be involved in ADP-ribosylation reactions that occur in important processes, such as chromatin biology, DNA repair and transcription regulation. The human macroH2A1.1 Macro domain binds an NAD metabolite O-acetyl-ADP-ribose. The Macro domain has been suggested to play a regulatory role in ADP-ribosylation, which is involved in inter- and intracellular signaling, transcriptional regulation, DNA repair pathways and maintenance of genomic stability, telomere dynamics, cell differentiation and proliferation, and necrosis and apoptosis.

    The 3D structure of the Macro domain has a mixed alpha/beta fold in which a mixed beta sheet is sandwiched between four helices. Several Macro-domain-only proteins are shorter than AF1521 and lack either the first strand or the C-terminal helix 5. Well-conserved residues form a hydrophobic cleft and cluster around the AF1521 ADP-ribose binding site.

    Proteins where this domain is known:
    MAL13P1.74    MAL7P1.83    PF14_0466   


    PS51156 - ELM2 (Prosite link)

    Interpro entry IPR000949 : (Interpro link)

    Interpro description:

    The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found N-terminal to a myb-like DNA-binding domain and a GATA-binding domain. In some instances, ELM2 is also found associated with the ARID DNA-binding domain. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain.

    Proteins where this domain is known:
    PF11_0429    PFE0995c   


    PS51157 - ZF_UBR (Prosite link)

    Interpro entry IPR003126 : Zinc finger, N-recognin (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    The N-end rule-based degradation signal, which targets a protein for ubiquitin-dependent proteolysis, comprises a destabilizing amino-terminal residue and a specific internal lysine residue. This entry describes a putative zinc finger in N-recognin, a recognition component of the N-end rule pathway.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFL1620w   


    PS51160 - ACYLPHOSPHATASE_3 (Prosite link)

    Interpro entry IPR001792 : Acylphosphatase (Interpro link)

    Interpro description:

    Acylphosphatase is an enzyme of approximately 98 amino acid residues that specifically catalyses the hydrolysis of the carboxyl-phosphate bond of acylphosphates, its substrates including 1,3-diphosphoglycerate and carbamyl phosphate. The enzyme has a mainly beta-sheet structure with 2 short alpha-helical segments. It is distributed in a tissue-specific manner in a wide variety of species, although its physiological role is as yet unknown: it may, however, play a part in the regulation of the glycolytic pathway and pyrimidine biosynthesis. There are two known isozymes. One seems to be specific to muscular tissues; the other, called 'organ-common type', is found in many different tissues. There are also bacterial and archaebacterial hypothetical proteins that are highly similar to the enzyme and probably possess the same activity.

    Proteins where this domain is known:
    PFF0255c   


    PS51161 - ATP_CONE (Prosite link)

    Interpro entry IPR005144 : (Interpro link)

    Interpro description:

    The ATP-cone is an evolutionarily mobile, ATP-binding regulatory domain which is found in a variety of proteins including ribonucleotide reductases, phosphoglycerate kinases and transcriptional regulators.

    In ribonucleotide reductase protein R1 from Escherichia coli this domain is located at the N-terminus, and is composed mostly of helices. It forms part of the allosteric effector region and contains the general allosteric activity site in a cleft located at the tip of the N-terminal region. This site binds either ATP (activating) or dATP (inhibitory), with the base bound in a hydrophobic pocket and the phosphates bound to basic residues. Nucleotide binding to this site is thought to affect enzyme activity by altering the relative positions of the two subunits of ribonucleotide reductase.

    Proteins where this domain is known:
    PF14_0352   


    PS51163 - YRDC (Prosite link)

    Interpro entry IPR006070 : (Interpro link)

    Interpro description:

    The YrdC family of hypothetical proteins is widely distributed in eukaryotes and prokaryotes. Its members occur (i) as independent proteins, (ii) with C-terminal extensions, and (iii) as domains in larger proteins, some of which are implicated in regulation. The YrdC protein, which consists solely of this domain, forms an alpha/beta twisted open-sheet structure composed of seven alpha helices and seven beta strands. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however, its exact function remains unknown. HypF, another protein containing a YrdC-like domain, is involved in the synthesis of the active site of [NiFe]-hydrogenases.

    Proteins where this domain is known:
    PFL0175c   


    PS51183 - JMJN (Prosite link)

    Interpro entry IPR003349 : (Interpro link)

    Interpro description:

    Jumonji protein is required for neural tube formation in mice. There is evidence of domain swapping within the jumonji family of transcription factors. This domain is often associated with JmjC.

    Proteins where this domain is known:
    MAL8P1.111   


    PS51184 - JMJC (Prosite link)

    Interpro entry IPR003347 : (Interpro link)

    Interpro description:

    This entry contains:

    Proteins where this domain is known:
    MAL8P1.111    PFF0135w   


    PS51186 - GNAT (Prosite link)

    Interpro entry IPR000182 : GCN5-related N-acetyltransferase (Interpro link)

    Interpro description:

    Histone acetylation is carried out by a class of enzymes known as histone acetyltransferases (HATs), which catalyze the transfer of an acetyl group from acetyl-CoA to the lysine ε-amino groups on the N-terminal tails of histones. An early indication that HATs were involved in transcription came from the observation that in actively transcribed regions of chromatin, histones tend to be hyperacetylated, whereas in transcriptionally silent regions histones are hypoacetylated. The histone acetyltransferases are divided into five families. These include the Gcn5-related acetyltransferases (GNATs); the MYST (for MOZ, Ybf2/Sas3, Sas2 and Tip60)-related HATs; p300/CBP HATs; the general transcription factor HATs, which include the TFIID subunit TAF250; and the nuclear hormone-related HATs SRC1 and ACTR (SRC3). The GCN5-related N-acetyltransferase superfamily includes such enzymes as the histone acetyltransferases GCN5 and Hat1, the elongator complex subunit Elp3, the mediator-complex subunit Nut1, and Hpa2.

    Many GNATs share several functional domains, including an N-terminal region of variable length, an acetyltransferase domain that encompasses the conserved sequence motifs, a region that interacts with the coactivator Ada2, and a C-terminal bromodomain that is believed to interact with acetyl-lysine residues. Members of the GNAT family are important for the regulation of cell growth and development. In mice, knockouts of Gcn5L are embryonic lethal. Yeast Gcn5 is needed for normal progression through the G2/M boundary and mitotic gene expression. The importance of GNATs is probably related to their role in transcription and DNA repair.

    The yeast GCN5 (yGCN5) transcriptional coactivator functions as a histone acetyltransferase (HAT) to promote transcriptional activation. The crystal structure of the yeast histone acetyltransferase Hat1 in complex with acetyl coenzyme A (AcCoA) shows that Hat1 has an elongated, curved structure, and the AcCoA molecule is bound in a cleft on the concave surface of the protein, marking the active site of the enzyme. A channel of variable width and depth that runs across the protein is probably the binding site for the histone substrate. The central protein core associated with AcCoA binding appears to be structurally conserved among a superfamily of N-acetyltransferases, including yeast histone acetyltransferase 1 and Serratia marcescens aminoglycoside 3-N-acetyltransferase.

    Proteins where this domain is known:
    MAL8P1.200    PF08_0034    PF10_0036    PF14_0350    PFA0465c   


    PS51188 - ZF_CR (Prosite link)

    Interpro entry IPR001305 : Heat shock protein DnaJ, cysteine-rich region (Interpro link)

    Interpro description:

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolyzing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.

    Proteins where this domain is known:
    PF14_0359    PFD0462w   


    PS51192 - HELICASE_ATP_BIND_1 (Prosite link)

    Interpro entry IPR014021 : (Interpro link)

    Interpro description:

    Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs has been identified. These two superfamilies encompass a large number of DNA and RNA helicases from archaea, eubacteria, eukaryotes and viruses that seem to be active as monomers or dimers. RNA and DNA helicases are considered to be enzymes that catalyse the separation of double-stranded nucleic acids in an energy-dependent manner.

    The various structures of SF1 and SF2 helicases present a common core with two alpha-beta RecA-like domains. The structural homology with the RecA recombination protein covers the five contiguous parallel beta strands and the tandem alpha helices. ATP binds to the amino proximal alpha-beta domain, where the Walker A (motif I) and Walker B (motif II) are found. The N-terminal domain also contains motif III (S-A-T) which was proposed to participate in linking ATPase and helicase activities. The carboxy-terminal alpha-beta domain is structurally very similar to the proximal one even though it is bereft of an ATP-binding site, suggesting that it may have originally arisen through gene duplication of the first one.

    This entry represents the ATP-binding domain found within most SF1 and SF2 helicases.
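
    The Walker A (P-loop) motif mentioned above is commonly written as the consensus [AG]-x(4)-G-K-[ST]. The following is a minimal sketch of how a protein sequence could be scanned for that consensus. The function name find_walker_a and the toy sequence are assumptions made for illustration; they do not come from this listing, and a pattern hit alone does not establish that a protein is a helicase.

```python
import re

# Walker A (P-loop) consensus written as a regular expression:
# [AG], any four residues, then G-K-[ST].
WALKER_A = re.compile(r"[AG].{4}GK[ST]")


def find_walker_a(sequence):
    """Return (1-based start, matched residues) for each non-overlapping hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(seq)]


# Hypothetical toy sequence, invented for illustration only.
toy = "MSTLRAPTGSGKTLAAVLPDEADLLM"
hits = find_walker_a(toy)
print(hits)
```

    Because the consensus is short and degenerate, hits of this kind are best treated as candidates to be confirmed against the domain assignments listed here.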

    Proteins where this domain is known:
    MAL13P1.14    MAL13P1.166    MAL13P1.216    MAL13P1.322    MAL7P1.113    MAL8P1.19    MAL8P1.65    PF08_0042    PF08_0048    PF08_0096    PF08_0111    PF08_0126    PF10_0209    PF10_0232    PF10_0294    PF10_0309    PF10_0369    PF11_0053    PF13_0037    PF13_0077    PF13_0177    PF13_0308    PF14_0183    PF14_0185    PF14_0234    PF14_0278    PF14_0370    PF14_0429    PF14_0436    PF14_0563    PF14_0655    PFB0445c    PFB0730w    PFB0860c    PFC0440c    PFC0915w    PFC0955w    PFD0245c    PFD0565c    PFD1060w    PFD1070w    PFE0205w    PFE0215w    PFE0430w    PFE0925c    PFE1085w    PFE1390w    PFF0100w    PFF0225w    PFF1185w    PFF1500c    PFI0165c    PFI0480w    PFI0860c    PFI0910w    PFL0100c    PFL1310c    PFL1525c    PFL2010c    PFL2440w    PFL2475w   


    PS51193 - HELICASE_ATP_BIND_2 (Prosite link)

    Interpro entry IPR014013 : (Interpro link)

    Interpro description:

    Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs has been identified. These two superfamilies encompass a large number of DNA and RNA helicases from archaea, eubacteria, eukaryotes and viruses that seem to be active as monomers or dimers. RNA and DNA helicases are considered to be enzymes that catalyze the separation of double-stranded nucleic acids in an energy-dependent manner.

    The various structures of SF1 and SF2 helicases present a common core with two alpha-beta RecA-like domains. The structural homology with the RecA recombination protein covers the five contiguous parallel beta strands and the tandem alpha helices. ATP binds to the amino proximal alpha-beta domain, where the Walker A (motif I) and Walker B (motif II) are found. The N-terminal domain also contains motif III (S-A-T) which was proposed to participate in linking ATPase and helicase activities. The carboxy-terminal alpha-beta domain is structurally very similar to the proximal one even though it is bereft of an ATP-binding site, suggesting that it may have originally arisen through gene duplication of the first one.

    This entry represents the ATP-binding domain found within bacterial DinG and eukaryotic Rad3 proteins, differing from other SF1 and SF2 helicases by the presence of a large insert after the Walker A motif.

    Proteins where this domain is known:
    MAL13P1.134    PF14_0081    PFI1650w   


    PS51194 - HELICASE_CTER (Prosite link)

    Interpro entry IPR001650 : DNA/RNA helicase, C-terminal (Interpro link)

    Interpro description:

    The domain that defines this group of proteins is found in a wide variety of helicases and helicase-related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.

    The eukaryotic translation initiation factor 4A (eIF4A) is a member of the DEA(D/H)-box RNA helicase family. This is a diverse group of proteins that couples an ATPase activity to RNA binding and unwinding. The structure of the carboxyl-terminal domain of eIF4A has been determined to 1.75 Å resolution; it has a parallel alpha-beta topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases.

    Proteins where this domain is known:
    MAL13P1.14    MAL13P1.166    MAL13P1.216    MAL13P1.322    MAL7P1.113    MAL7P1.201    MAL8P1.19    MAL8P1.65    PF08_0042    PF08_0048    PF08_0096    PF08_0111    PF08_0126    PF10_0209    PF10_0232    PF10_0294    PF10_0309    PF10_0369    PF11_0053    PF11_0077    PF13_0037    PF13_0077    PF13_0177    PF13_0308    PF14_0183    PF14_0185    PF14_0234    PF14_0278    PF14_0370    PF14_0429    PF14_0437    PF14_0563    PF14_0655    PFA0180w    PFB0445c    PFB0730w    PFB0860c    PFC0440c    PFC0915w    PFC0955w    PFD0245c    PFD0565c    PFD1060w    PFD1070w    PFE0205w    PFE0215w    PFE0430w    PFE0925c    PFE1085w    PFE1390w    PFF0100w    PFF0225w    PFF1140c    PFF1185w    PFF1500c    PFI0165c    PFI0480w    PFI0860c    PFI0910w    PFL0100c    PFL1310c    PFL1525c    PFL2010c    PFL2440w    PFL2475w   
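
    The HELICASE_ATP_BIND_1 and HELICASE_CTER protein lists above overlap heavily but are not identical. The sketch below shows one way to compare two such whitespace-separated lists. The helper parse_ids is a name chosen for illustration, and the two strings are short excerpts copied from the lists above, not the complete lists.

```python
def parse_ids(line):
    """Split a whitespace-separated line of gene identifiers into a set."""
    return set(line.split())


# Excerpts copied from the two protein lists above (not the full lists).
atp_bind_1 = parse_ids(
    "MAL7P1.113    MAL8P1.19    PF11_0053    PF13_0037    "
    "PF14_0429    PF14_0436    PF14_0563"
)
helicase_cter = parse_ids(
    "MAL7P1.113    MAL7P1.201    MAL8P1.19    PF11_0053    PF11_0077    "
    "PF13_0037    PF14_0429    PF14_0437    PF14_0563"
)

both = sorted(atp_bind_1 & helicase_cter)       # carry both domains
only_atp = sorted(atp_bind_1 - helicase_cter)   # ATP-binding domain only
only_cter = sorted(helicase_cter - atp_bind_1)  # C-terminal domain only

print("both:", both)
print("only ATP_BIND_1:", only_atp)
print("only CTER:", only_cter)
```

    The comparison only reports where the two lists differ; it cannot tell whether a difference reflects the underlying annotation or an error in the listing.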


    PS51195 - Q_MOTIF (Prosite link)

    Interpro entry IPR014014 : (Interpro link)

    Interpro description:

    RNA helicases from the DEAD-box family are found in almost all organisms and have important roles in RNA metabolism such as splicing, RNA transport, ribosome biogenesis, translation and RNA decay. They are enzymes that unwind double-stranded RNA molecules in an energy-dependent fashion through the hydrolysis of NTP. DEAD-box RNA helicases belong to superfamily 2 (SF2) of helicases. Like other SF1 and SF2 members, they contain seven conserved motifs which are characteristic of these two superfamilies. The DEAD box is named after the amino acids of motif II or Walker B (Mg2+-binding aspartic acid). Besides these seven motifs, DEAD-box RNA helicases contain a conserved cluster of nine amino acids (the Q motif) with an invariant glutamine located N-terminal to motif I. An additional highly conserved but isolated aromatic residue is also found upstream of these nine residues. The Q motif is characteristic of and unique to the DEAD-box family of helicases. It is thought to control ATP binding and hydrolysis, and therefore it represents a potential mechanism for regulating helicase activity.

    Several structural analyses of DEAD-box RNA helicases have been reported. The Q motif is located in close proximity to motif I. The conserved glutamine and aromatic residues interact with the ADP molecule.

    This entry represents a region stretching from the conserved aromatic residue to one amino acid after the glutamine of the Q motif.

    Proteins where this domain is known:
    MAL13P1.166    MAL7P1.113    MAL8P1.19    PF08_0096    PF08_0111    PF13_0177    PF14_0183    PF14_0436    PF14_0563    PF14_0655    PFB0445c    PFB0860c    PFC0915w    PFD1070w    PFE0205w    PFE0215w    PFE0430w    PFE0925c    PFE1085w    PFE1390w    PFF1500c    PFL1310c    PFL2010c    PFL2475w   


    PS51198 - UVRD_HELICASE_ATP_BIND (Prosite link)

    Interpro entry IPR014016 : Helicase, superfamily 1, UvrD-related (Interpro link)

    Interpro description:

    Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain. UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 helicases by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity.

    Crystal structures of several UvrD-like DNA helicases have been solved. They are monomeric enzymes consisting of two domains with a common alpha-beta RecA-like core. The ATP-binding site is situated in a cleft between the N-terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130°. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme.

    This entry represents the ATP-binding domain found in UvrD-like helicases.

    Proteins where this domain is known:
    PFE0705c   


    PS51199 - SF4_HELICASE (Prosite link)

    Interpro entry IPR007694 : DNA helicase, DnaB-like, C-terminal (Interpro link)

    Interpro description:

    The hexameric helicase DnaB unwinds the DNA duplex at the Escherichia coli chromosome replication fork. Although the mechanism by which DnaB both couples ATP hydrolysis to translocation along DNA and denatures the duplex is unknown, a change in the quaternary structure of the protein involving dimerization of the N-terminal domain has been observed and may occur during the enzymatic cycle. This C-terminal domain contains an ATP-binding site and is therefore probably the site of ATP hydrolysis.

    Proteins where this domain is known:
    PF14_0112   


    PS51203 - CS (Prosite link)

    Interpro entry IPR007052 : (Interpro link)

    Interpro description:

    The function of the CS domain is unknown. The CS domain is sometimes found C-terminal to the CHORD domain in metazoan proteins, but occurs separately from the CHORD domain in plants. This association is thought to be indicative of a functional interaction between CS and CHORD domains.

    Proteins where this domain is known:
    MAL8P1.96    PF13_0204    PF14_0465    PF14_0510    PF14_0696    PFC0581w    PFC0715c    PFI0990c    PFI1325w    PFL1845c   


    PS51204 - HSA (Prosite link)

    Interpro entry IPR014012 : (Interpro link)

    Interpro description:

    The helicase/SANT-associated (HSA) domain is a predicted DNA-binding domain of ~75 amino acids, which is found in the eukaryotic SRCAP/p400/DOM and SNF2/brahma families. While each family has the core sequences that define the HSA domain, they each also have additional sequences that distinguish these families from one another. For example, the sequence HWDY(L/C)EEEM(Q/V) is found in the SRCAP/p400/DOM family, whereas the sequence HQE(Y/F)LNSILQ is found in the SNF2/brahma family. In addition to the SANT and helicase domains, the HSA domain is also found in association with the bromo domain.

    Proteins where this domain is known:
    PF08_0048   


    PS51205 - VPS9 (Prosite link)

    Interpro entry IPR003123 : (Interpro link)

    Interpro description:

    This domain is present in yeast vacuolar sorting protein 9 and other proteins.

    Proteins where this domain is known:
    MAL8P1.82    PF11_0403   


    PS51214 - IBB (Prosite link)

    Interpro entry IPR002652 : Importin-alpha-like, importin-beta-binding region (Interpro link)

    Interpro description:

    The exchange of macromolecules between the nucleus and cytoplasm takes place through nuclear pore complexes within the nuclear membrane. Active transport of large molecules through these pore complexes requires carrier proteins, called karyopherins (importins and exportins), which shuttle between the two compartments.

    Members of the importin-alpha (karyopherin-alpha) family can form heterodimers with importin-beta. As part of a heterodimer, importin-beta mediates interactions with the pore complex, while importin-alpha acts as an adaptor protein to bind the nuclear localisation signal (NLS) on the cargo through the classical NLS import of proteins. Proteins can contain one (monopartite) or two (bipartite) NLS motifs. Importin-alpha contains several armadillo (ARM) repeats, which produce a curving structure with two NLS-binding sites, a major one close to the N-terminus and a minor one close to the C-terminus.

    Ran GTPase helps to control the unidirectional transfer of cargo. The cytoplasm contains primarily RanGDP and the nucleus RanGTP through the actions of RanGAP and RanGEF, respectively. In the nucleus, RanGTP binds to importin-beta within the importin/cargo complex, causing a conformational change in importin-beta that releases it from importin-alpha-bound cargo. The N-terminal importin-beta-binding (IBB) domain of importin-alpha contains an auto-regulatory region that mimics the NLS motif. The release of importin-beta frees the auto-regulatory region on importin-alpha to loop back and bind to the major NLS-binding site, causing the cargo to be released.

    This entry represents the N-terminal IBB domain of importin-alpha that contains the auto-regulatory region.

    More information about these proteins can be found at Protein of the Month: Importins.

    Proteins where this domain is known:
    PF08_0087   


    PS51215 - AWS (Prosite link)

    Interpro entry IPR006560 : AWS (Interpro link)

    Interpro description:

    The AWS (Associated With SET) domain, whose function is unknown, is found in eukaryotic proteins of unknown function. As the name suggests, it is often found in association with the SET domain, suggesting a role in gene regulation by methylation of lysine residues in histones and other proteins.

    Proteins where this domain is known:
    MAL13P1.122   


    PS51217 - UVRD_HELICASE_CTER (Prosite link)

    Interpro entry IPR014017 : DNA helicase, UvrD-like, C-terminal (Interpro link)

    Interpro description:

    Helicases have been classified in 5 superfamilies (SF1-SF5). All of the proteins bind ATP and, consequently, all of them carry the classical Walker A (phosphate-binding loop or P-loop) and Walker B (Mg2+-binding aspartic acid) motifs. For the two largest groups, commonly referred to as SF1 and SF2, a total of seven characteristic motifs have been identified which are distributed over two structural domains, an N-terminal ATP-binding domain and a C-terminal domain.

    This entry represents the C-terminal domain.

    UvrD-like DNA helicases belong to SF1, but they differ from classical SF1/SF2 helicases by a large insertion in each domain. UvrD-like DNA helicases unwind DNA with a 3'-5' polarity. Crystal structures of several UvrD-like DNA helicases have been solved. They are monomeric enzymes consisting of two domains with a common alpha-beta RecA-like core. The ATP-binding site is situated in a cleft between the N-terminus of the ATP-binding domain and the beginning of the C-terminal domain. The enzyme crystallizes in two different conformations (open and closed). The conformational difference between the two forms comprises a large rotation of the end of the C-terminal domain by approximately 130°. This "domain swiveling" was proposed to be an important aspect of the mechanism of the enzyme.

    Proteins where this domain is known:
    PFE0705c   


    PS51219 - DPCK (Prosite link)

    Interpro entry IPR001977 : Dephospho-CoA kinase (Interpro link)

    Interpro description:

    This family contains dephospho-CoA kinases, which catalyze the final step in CoA biosynthesis, the phosphorylation of the 3'-hydroxyl group of ribose using ATP as a phosphate donor.

    The crystal structures of a number of the proteins in this entry have been determined, including the structure of the protein from Haemophilus influenzae to 2.0 Å resolution in a complex with ATP. The protein consists of three domains: the nucleotide-binding domain with a five-stranded parallel beta-sheet, the substrate-binding alpha-helical domain, and the lid domain formed by a pair of alpha-helices. The overall topology of the protein resembles the structures of other nucleotide kinases.

    Proteins where this domain is known:
    PF14_0415   


    PS51221 - TTL (Prosite link)

    Interpro entry IPR004344 : Tubulin-tyrosine ligase (Interpro link)

    Interpro description:

    Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). Tubulin-tyrosine ligase (TTL) catalyses the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. The true physiological function of TTL has so far not been established. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness.

    3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis.

    Proteins where this domain is known:
    PF10_0094    PF11_0481    PFE0700c   


    PS51228 - ACB_2 (Prosite link)

    Interpro entry IPR000582 : Acyl-CoA-binding protein, ACBP (Interpro link)

    Interpro description:

    Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.

    ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.

    Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal Delta(3),Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.

    The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.

    Other proteins containing an ACB domain include:

    Proteins where this domain is known:
    PF08_0099    PF10_0015    PF10_0016    PF11_0197    PF14_0749   


    PS51230 - EB1_C (Prosite link)

    Interpro entry IPR004953 : EB1, C-terminal (Interpro link)

    Interpro description:

    A group of microtubule-associated proteins called +TIPs (plus end tracking proteins), including EB1 (end-binding protein 1) family proteins, label growing microtubule ends specifically in diverse organisms and are implicated in spindle dynamics, chromosome segregation, and directing microtubules toward cortical sites. EB1 members have a bipartite composition: an N-terminal CH domain that mediates microtubule plus end localization and a C-terminal cargo binding domain (EB1-C) that captures cell polarity determinants. The EB1-C domain comprises a unique EB1-like sequence motif that acts as a binding site for other +TIP proteins. It interacts with the carboxy terminus of the adenomatous polyposis coli (APC) tumor suppressor, a well conserved +TIP phosphoprotein with a pivotal function in cell cycle regulation. Another binding partner of the EB1-C domain is the well conserved +TIP protein dynactin, a component of the large cytoplasmic dynein/dynactin complex.

    The ~80-residue EB1-C domain starts with a long smoothly curved helix (alpha1), which is followed by a hairpin connection leading to a short second helix (alpha2) running antiparallel to alpha1. The two parallel alpha1 helices of the EB1-C domain dimer wrap around each other in a slightly left-handed supercoil. The two alpha2 helices run antiparallel to helices alpha1 and form a similar fork in the opposite orientation and rotated by 90°. As a result, two helical segments from each monomer form a four-helix bundle. The side chains forming the hydrophobic core of this bundle are highly conserved.

    Some proteins known to contain an EB1-C domain are listed below:

    Proteins where this domain is known:
    PFC0305w   


    PS51257 - PROKAR_LIPOPROTEIN (Prosite link)

    Proteins where this domain is known:
    MAL13P1.154    MAL13P1.259    PF08_0136b    PF10_0045    PF10_0164    PF10_0323    PF11_0176    PF14_0014    PF14_0729    PFA0490w    PFA0635c    PFC0435w    PFD0930w    PFE1525w    PFF0450c    PFI0140w    PFI0500w    PFL2315c   


    PS51263 - ADF_H (Prosite link)

    Interpro entry IPR002108 : Actin-binding, cofilin/tropomyosin type (Interpro link)

    Interpro description:

    The actin-depolymerising factor homology (ADF-H) domain is an ~150-amino acid motif that is present in three phylogenetically distinct classes of eukaryotic actin-binding proteins:

    Although these proteins are biochemically distinct and play different roles in actin dynamics, they all appear to use the ADF-H domain for their interactions with actin.

    The ADF-H domain consists of a six-stranded mixed beta-sheet in which the four central strands (beta2-beta5) are anti-parallel and the two edge strands (beta1 and beta6) run parallel with the neighbouring strands. The sheet is surrounded by two alpha-helices on each side.

    Proteins where this domain is known:
    PF13_0326    PFE0165w   


    PS51266 - ZF_CHY (Prosite link)

    Interpro entry IPR008913 : Zinc finger, CHY-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    Pirh2 is a eukaryotic ubiquitin protein ligase, which has been shown to promote p53 degradation in mammals. Pirh2 physically interacts with p53 and promotes ubiquitination of p53 independently of MDM2. Like MDM2, Pirh2 is thought to participate in an autoregulatory feedback loop that controls p53 function. Pirh2 proteins contain three distinct zinc fingers: the CHY-type, the CTCHY-type, which is C-terminal to the CHY-type zinc finger, and a RING finger. The CHY-type zinc finger has no currently known function.

    As well as Pirh2, the CHY-type zinc finger is also found in the following proteins:

    The solution structure of this zinc finger has been solved; the finger binds 3 zinc atoms, as shown in the following schematic representation:

    More information about these proteins can be found at Protein of the Month: Zinc Fingers

    Proteins where this domain is known:
    PFI1480w   


    PS51273 - GATASE_TYPE_1 (Prosite link)

    Interpro entry IPR017926 : (Interpro link)

    Interpro description:

    Glutamine amidotransferase (GATase) enzymes catalyse the removal of the ammonia group from glutamine and then transfer this group to a substrate to form a new carbon-nitrogen group. The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. Two classes of GATase domains have been identified: class-I (also known as trpG-type or triad) and class-II (also known as purF-type or Ntn). Class-I (or type 1) GATase domains have been found in the following enzymes:

    A triad of conserved Cys-His-Glu forms the active site, wherein the catalytic cysteine is essential for the amidotransferase activity. Different structures show that the active site Cys of type 1 GATase is located at the tip of a nucleophile elbow.

    Proteins where this domain is known:
    PF10_0123    PF13_0044    PF14_0100    PFI1100w   


    PS51278 - GATASE_TYPE_2 (Prosite link)

    Interpro entry IPR017932 : (Interpro link)

    Interpro description:

    A large group of biosynthetic enzymes are able to catalyse the removal of the ammonia group from glutamine and then to transfer this group to a substrate to form a new carbon-nitrogen group. This catalytic activity is known as glutamine amidotransferase (GATase). The GATase domain exists either as a separate polypeptidic subunit or as part of a larger polypeptide fused in different ways to a synthase domain. On the basis of sequence similarities two classes of GATase domains have been identified: class-I (also known as trpG-type or triad) and class-II (also known as purF-type or Ntn). Class-II (or type 2) GATase domains have been found in the following enzymes:

    The active site is formed by a cysteine present at the N-terminal extremity of the mature form of all these enzymes. Two other conserved residues, Asn and Gly, form an oxyanion hole for stabilisation of the formed tetrahedral intermediate. An insert of ~120 residues can occur between the conserved regions. In some class-II GATases (for example in Bacillus subtilis or chicken amidophosphoribosyltransferase) the enzyme is synthesised with a short propeptide which is cleaved off post-translationally by a proposed autocatalytic mechanism. Nuclear-encoded Fd-dependent gltS have a longer propeptide which may contain a chloroplast-targeting peptide in addition to the propeptide that is excised on enzyme activation.

    The 3-D structure of the GATase type 2 domain forms a four layer alpha/beta/beta/alpha architecture which consists of a fold similar to the N-terminal nucleophile (Ntn) hydrolases. These have the capacity for nucleophilic attack and the possibility of autocatalytic processing. The N-terminal position and the folding of the catalytic Cys differ strongly from the Cys-His-Glu triad which forms the active site of GATases of type 1.

    Proteins where this domain is known:
    PF10_0245    PF14_0334    PFC0395w   


    PS51279 - BCNT_C (Prosite link)

    Interpro entry IPR011421 : (Interpro link)

    Interpro description:

    Vertebrate BCNT (named after Bucentaur) protein is found in the nucleus and cytosol. Gene duplication of the ancestral BCNT gene leads to the h-type BCNT or craniofacial development protein 1 (CFDP1) gene and the ruminant-specific p97BCNT or craniofacial development protein 2 (CFDP2) gene. The h-type BCNT proteins contain a highly conserved 82-amino acid region at the C-terminus (BCNT-C) that is not present in p97BCNT. Instead ruminant p97BCNT contains a region derived from the endonuclease domain of a retrotransposable element RTE-1. In addition to h-type BCNT proteins, a BCNT-C domain is also found in Drosophila YETI, a protein that binds to a microtubule-based motor kinesin-1, and the yeast SWR1-complex protein 5 (SWC5) or AOR1 (actin overexpression resistant 1), a component of the SWR1 chromatin remodeling complex.

    Proteins where this domain is known:
    PFE0275w   


    PS51283 - DUSP (Prosite link)

    Interpro entry IPR006615 : Peptidase C19, ubiquitin-specific peptidase, DUSP domain (Interpro link)

    Interpro description:

    Deubiquitinating enzymes (DUBs) form a large family of cysteine proteases that can deconjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. All DUBs contain a catalytic domain surrounded by one or more subdomains, some of which contribute to target recognition. The ~120-residue DUSP (domain present in ubiquitin-specific proteases) domain is one of these specific subdomains. Single or tandem DUSP domains are located both N- and C-terminal to the ubiquitin carboxyl-terminal hydrolase catalytic core domain.

    The DUSP domain displays a tripod-like AB3 fold with a three-helix bundle and a three-stranded anti-parallel beta-sheet resembling the legs and seat of the tripod (see PDB:1W6V). Conserved residues are predominantly involved in hydrophobic packing interactions within the three alpha-helices. The most conserved DUSP residues, forming the PGPI motif, are flanked by two long loops that vary both in length and sequence. The PGPI motif packs against the three-helix bundle and is highly ordered.

    The function of the DUSP domain is unknown but it may play a role in protein/protein interaction or substrate recognition. This domain is associated with ubiquitin carboxyl-terminal hydrolase family 2 (MEROPS peptidase family C19). They are a family of 100 to 200 kDa peptides that includes the Ubp1 ubiquitin peptidase from yeast; others include:

    Proteins where this domain is known:
    PFI0225w   


    PS51284 - DOC (Prosite link)

    Interpro entry IPR004939 : Anaphase-promoting complex, subunit 10 (Interpro link)

    Interpro description:

    The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation.

    Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication.

    APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts.

    One of the subunits of the APC that is required for ubiquitination activity is APC10, a one-domain protein homologous to a sequence element, termed the DOC domain, found in several hypothetical proteins that may also mediate ubiquitination reactions, because they contain combinations of either RING finger, cullin or HECT domains.

    The DOC domain consists of a beta-sandwich, in which a five-stranded antiparallel beta-sheet is packed on top of a three-stranded antiparallel beta-sheet, exhibiting a 'jellyroll' fold.

    Proteins known to contain a DOC domain include:

    Proteins where this domain is known:
    PFL0850w   


    PS51285 - AGC_KINASE_CTER (Prosite link)

    Interpro entry IPR000961 : AGC-kinase, C-terminal (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    The AGC (cAMP-dependent, cGMP-dependent and protein kinase C) protein kinase family embraces a collection of protein kinases that display a high degree of sequence similarity within their respective kinase domains. AGC kinase proteins are characterised by three conserved phosphorylation sites that critically regulate their function. The first one is located in an activation loop in the centre of the kinase domain. The two other phosphorylation sites are located outside the kinase domain in a conserved region on its C-terminal side, the AGC-kinase C-terminal domain. These sites serve as phosphorylation-regulated switches to control both intra- and inter-molecular interactions. Without these priming phosphorylations, the kinases are catalytically inactive.

    Several structures of the AGC-kinase C-terminal domain have been solved. The first phosphorylation site is located in a turn motif, the second one at the end of the domain in a hydrophobic pocket. In PKB the phosphorylated hydrophobic motif engages a hydrophobic groove within the N-lobe of the kinase domain which orders alpha helices close to the active site.

    Proteins where this domain is known:
    PF14_0346    PFI1685w    PFL2250c   


    PS51286 - RAP (Prosite link)

    Interpro entry IPR013584 : (Interpro link)

    Interpro description:

    The ~60-residue RAP (an acronym for RNA-binding domain abundant in Apicomplexans) domain is found in various proteins in eukaryotes. It is particularly abundant in apicomplexans and might mediate a range of cellular functions through its potential interactions with RNA.

    The RAP domain consists of multiple blocks of charged and aromatic residues and is predicted to be composed of alpha helical and beta strand structures. Two predicted loop regions that are dominated by glycine and tryptophan residues are found before and after the central beta sheet. Some proteins known to contain a RAP domain are listed below:

    Proteins where this domain is known:
    MAL7P1.23    PF08_0070    PF10_0064    PF10_0291    PF11_0153    PF11_0247    PF13_0292    PF14_0509    PF14_0673    PFA0255c    PFE0800w    PFE0905w    PFE1295c    PFF0235c    PFF1400w    PFI1045w    PFL1280w   


    PS51292 - ZF_RING_CH (Prosite link)

    Interpro entry IPR011016 : Zinc finger, RING-CH-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    The RING finger is a well characterised zinc finger which coordinates two zinc atoms in a cross-braced manner. According to the pattern of cysteines and histidines, three different subfamilies of RING finger can be defined. The classical RING finger (RING-HC) has a histidine at the fourth coordinating position and a cysteine at the fifth. In the RING-H2 variant, both the fourth and fifth positions are occupied by histidines. The RING-CH, which is very similar to the classical RING finger, differs from both of these variants in that it has a Cys residue in the fourth position and a His in the fifth. Another difference between the RING-CH and the common RING variants is a somewhat longer peptide segment between the fourth and fifth zinc-coordinating residues. The RING-CH zinc finger thus has the same arrangement of cysteine and histidine (C4HC3) as the PHD zinc finger, but it contains features (spacing between the cysteines and the histidine) characteristic of the genuine RING-finger (C3HC4). The RING-CH-type is an E3 ligase mainly found in proteins associated with membranes.
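
    The ligand patterns described above can be written as a small lookup. The following is a minimal Python sketch and is not part of the Interpro entry: the function name classify_ring and the eight-letter ligand string are hypothetical conventions chosen for illustration, and the sketch assumes that the six ligands other than the fourth and fifth are all cysteines. It does not model the spacing between ligands, which the text also treats as a distinguishing feature.

```python
def classify_ring(ligands: str) -> str:
    """Classify a RING-type finger from its eight zinc ligands, in sequence order.

    `ligands` is a string of eight letters, each 'C' (Cys) or 'H' (His),
    e.g. "CCCHCCCC" for the classical C3HC4 arrangement.
    """
    ligands = ligands.upper()
    if len(ligands) != 8 or set(ligands) - {"C", "H"}:
        raise ValueError("expected eight ligands, each C or H")

    # Assumption: every position except the fourth and fifth is a cysteine.
    if ligands[:3] + ligands[5:] != "CCCCCC":
        return "unclassified"

    # The fourth and fifth coordinating positions decide the subfamily.
    by_positions_4_and_5 = {
        ("H", "C"): "RING-HC",  # classical RING, C3HC4
        ("H", "H"): "RING-H2",
        ("C", "H"): "RING-CH",  # C4HC3, same ligand order as a PHD finger
    }
    return by_positions_4_and_5.get((ligands[3], ligands[4]), "unclassified")


print(classify_ring("CCCCHCCC"))  # prints "RING-CH"
```

    Because the RING-CH and PHD fingers share the C4HC3 ligand order, a pattern-only check like this one cannot tell them apart; the spacing criterion would be needed for that.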

    The solution structure of the RING-CH-type zinc finger of the herpesvirus Mir1 protein has shown that it is an outlying relative of the cellular RING finger domain family, with its polypeptide backbone much more closely resembling that of RING domains than PHD domains. The only real difference between the classic and variant RING domains, other than the alteration of zinc ligands, is the loss of the small beta-sheet found in RING domains and the replacement of one strand of this sheet with a single turn of helix. Some proteins that contain a RING-CH-type zinc finger are listed below:

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.405    PFI0470w   


    PS51293 - SANT (Prosite link)

    Interpro entry IPR017884 : (Interpro link)

    Interpro description:

    The myb family can be classified into three groups: the myb-type HTH domain, which binds DNA, the SANT domain, which is a protein-protein interaction module and the myb-like domain that can be involved in either of these functions.

    The SANT domain is a motif of ~50 amino acids present in proteins involved in chromatin-remodelling and transcription regulation. This eukaryotic domain was identified in nuclear receptor co-repressors and named after switching-defective protein 3 (Swi3), adaptor 2 (Ada2), nuclear receptor co-repressor (N-CoR) and transcription factor (TF)IIIB. Although SANT domains show remarkable sequence and structural similarity to the DNA-binding helix-turn-helix (HTH) domain of the myb-like tandem repeat, their function is not DNA binding. Instead, SANT domains are protein-protein interaction modules and some can bind to histone tails (e.g. in Ada2 and SMRT). SANT domains are found in combination with other domains, such as the SWIRM domain, the ZZ-type zinc finger, the C2H2-type zinc finger, the GATA-type zinc finger, the MPN domain and the DEAH ATP-helicase domain. The SANT domain was proposed to function as a histone-interaction module that couples histone-tail binding to enzyme catalysis for the remodelling of nucleosomes.

    The 3-dimensional structure of the SANT domain forms three alpha helices (see PDB:1OFC) similar to the DNA-binding myb-type HTH domain. Because of the strong resemblance, the SANT domain can also be detected as a myb-like "DNA-binding" domain. Most SANT domains have acidic amino acids at the start of helix 2 and in helix 3, while myb-like DNA-binding domains have more positively charged residues, in particular in their third 'recognition' helix. The bulky aromatic and hydrophobic residues in the centre of helix 3 that are incompatible with DNA contacts of myb-like DNA-binding domains form another distinguishing property of SANT domains.

    Proteins where this domain is known:
    PF10_0143    PF11_0241    PFL0815w    PFL1215c   


    PS51294 - HTH_MYB (Prosite link)

    Interpro entry IPR017930 : (Interpro link)

    Interpro description:

    The myb family can be classified into three groups: the myb-type HTH domain, which binds DNA, the SANT domain, which is a protein-protein interaction module and the myb-like domain that can be involved in either of these functions.

    The myb-type HTH domain is a DNA-binding, helix-turn-helix (HTH) domain of ~55 amino acids, typically occurring in a tandem repeat in eukaryotic transcription factors. The domain is named after the retroviral oncogene v-myb, and its cellular counterpart c-myb, which encode nuclear DNA-binding proteins that specifically recognize the sequence YAAC(G/T)G. Myb proteins contain three tandem repeats of 51 to 53 amino acids, termed R1, R2 and R3. This repeat region is involved in DNA-binding and R2 and R3 bind directly to the DNA major groove. The major part of the first repeat is missing in retroviral v-Myb sequences and in plant myb-related (R2R3) proteins. A single myb-type HTH DNA-binding domain occurs in TRF1 and TRF2. The 3D-structure of the myb-type HTH domain forms three alpha-helices. The second and third helices, connected via a turn, comprise the helix-turn-helix motif. Helix 3 is termed the recognition helix as it binds the DNA major groove, as in other HTHs.
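
    The recognition sequence YAAC(G/T)G quoted above can be turned into a simple pattern search. The following is a minimal Python sketch, not part of the Interpro entry: the function name find_myb_sites and the test sequence are invented for illustration. It assumes the standard IUPAC reading of Y as a pyrimidine (C or T) and scans only the strand it is given; a match says nothing about whether a Myb protein actually binds there.

```python
import re

# Y is the IUPAC code for a pyrimidine (C or T); (G/T) becomes [GT].
# The lookahead lets overlapping candidate sites be reported as well.
MYB_SITE = re.compile(r"(?=([CT]AAC[GT]G))")


def find_myb_sites(dna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched hexamer) for each YAAC(G/T)G match."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in MYB_SITE.finditer(dna)]


# Hypothetical sequence, made up for this example.
print(find_myb_sites("GGTAACGGATCCAACTGTT"))
# prints [(2, 'TAACGG'), (11, 'CAACTG')]
```

    To cover both strands, the same function could be applied to the reverse complement of the input.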

    Proteins where this domain is known:
    PF10_0327   


    PS51296 - RIESKE (Prosite link)

    Interpro entry IPR017941 : Rieske [2Fe-2S] iron-sulphur domain (Interpro link)

    Interpro description:

    There are multiple types of iron-sulphur clusters which are grouped into three main categories based on their atomic content: [2Fe-2S], [3Fe-4S], [4Fe-4S], and other hybrid or mixed metal types. Two general types of [2Fe-2S] clusters are known and they differ in their coordinating residues. The ferredoxin-type [2Fe-2S] clusters are coordinated to the protein by four cysteine residues. The Rieske-type [2Fe-2S] cluster is coordinated to its protein by two cysteine residues and two histidine residues.

    The structure of several Rieske domains has been solved. It contains three layers of antiparallel beta sheets forming two beta sandwiches. Both beta sandwiches share the central sheet 2. The metal-binding site is at the top of the beta sandwich formed by the sheets 2 and 3. The Fe1 iron of the Rieske cluster is coordinated by two cysteines while the other iron Fe2 is coordinated by two histidines. Two inorganic sulphide ions bridge the two iron ions forming a flat, rhombic cluster.

    Rieske-type iron-sulphur clusters are common to electron transfer chains of mitochondria and chloroplasts and to non-haem iron oxygenase systems:

    Proteins where this domain is known:
    PF07_0085    PF14_0373   


    PS51309 - PABC (Prosite link)

    Interpro entry IPR002004 : Polyadenylate-binding protein/Hyperplastic disc protein (Interpro link)

    Interpro description:

    The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains. PABP recognises the 3' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail.

    Proteins where this domain is known:
    PFL1170w   


    PS51319 - TFIIS_N (Prosite link)

    Interpro entry IPR017923 : (Interpro link)

    Interpro description:

    Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

    TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    The TFIIS N-terminal domain is a compact four-helix bundle. The hydrophobic core residues of helices 2, 3, and 4 are well conserved among TFIIS domains, although helix 1 is less conserved.

    Proteins where this domain is known:
    PF07_0057    PF11_0093    PFI0285w   


    PS51321 - TFIIS_CENTRAL (Prosite link)

    Interpro entry IPR003618 : Transcription elongation factor S-II, central region (Interpro link)

    Interpro description:

    Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

    TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    This domain is found in the central region of transcription elongation factor S-II and in several hypothetical proteins.

    Proteins where this domain is known:
    PF07_0057    PF11_0289   


    PS51324 - ERV_ALR (Prosite link)

    Interpro entry IPR017905 : (Interpro link)

    Interpro description:

    The ~100-residue ERV/ALR sulphydryl oxidase domain is a versatile module adapted for catalysis of disulphide bond formation in various organelles and biological settings. The ERV/ALR sulphydryl oxidase domain has a Cys-X-X-Cys dithiol/disulphide motif adjacent to a bound FAD cofactor, enabling transfer of electrons from thiol substrates to non-thiol electron acceptors. ERV/ALR family members differ in their N- or C-terminal extensions, which typically contain at least one additional disulphide bond, the hypothesised 'shuttle' disulphide. In yeast ERV1, a mitochondrial enzyme, the shuttle disulphide is N-terminal to the catalytic core; in yeast ERV2, present in the endoplasmic reticulum, it is C-terminal. The N- and C-terminal extensions can be entire domains, such as the thioredoxin-like domains or short segments that do not seem to be distinct domains. Proteins of the ERV/ALR family are encoded by all eukaryotes and cytoplasmic DNA viruses (poxviruses, African swine fever virus, iridoviruses, and Paramecium bursaria Chlorella virus 1).
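
    The Cys-X-X-Cys motif mentioned above is easy to look for in a protein sequence. The following is a minimal Python sketch, not part of the Interpro entry: the function name find_cxxc and the test peptide are invented for illustration. It only lists candidate C-X-X-C tetrapeptides in one-letter code; it cannot tell whether a candidate lies next to a bound FAD or is redox active, and the motif also occurs in many unrelated proteins.

```python
import re

# C, any two residues, C. The lookahead reports overlapping candidates too.
CXXC = re.compile(r"(?=(C[A-Z]{2}C))")


def find_cxxc(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of the first Cys, tetrapeptide) per candidate."""
    protein = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in CXXC.finditer(protein)]


# Hypothetical peptide, made up for this example.
print(find_cxxc("MKCEACWKLCGGC"))
# prints [(3, 'CEAC'), (10, 'CGGC')]
```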

    The ERV/ALR sulphydryl oxidase domain contains a four-helix bundle (helices alpha1-alpha4) and an additional single turn of helix (alpha5) packed perpendicular to the bundle. The FAD prosthetic group is housed at the mouth of the 4-helix bundle and communicates with the pair of juxtaposed cysteine residues that form the proximal redox active site.

    Proteins where this domain is known:
    PFA0500w    PFL2020c   


    PS51329 - C_CAP_COFACTOR_C (Prosite link)

    Interpro entry IPR017901 : (Interpro link)

    Interpro description:

    The C-CAP/cofactor C-like domain is present in several cytoskeleton-related proteins, which also contain a number of additional domains:

    The cyclase-associated protein C-CAP/cofactor C-like domain binds G-actin and is responsible for oligomerisation of the entire CAP molecule, whereas the XRP2 C-CAP/cofactor C-like domain is required for binding of ADP ribosylation factor-like protein 3 (Arl3).

    The central core of the C-CAP/cofactor C-like domain is composed of six coils of right-handed parallel beta-helices, termed coils 1-6, which form an elliptical barrel with a tightly packed interior. Each beta-helical coil is composed of three relatively short beta-strands, designated a-c, separated by sharp turns. Flanking the central beta-helical core is an N-terminal beta-strand, beta0, that packs antiparallel to the core, and strand beta7 packs antiparallel to the core near the C-terminal end of the parallel beta-helix.

    Proteins where this domain is known:
    PFA0260c    PFL0165c   


    PS51330 - DHFR_2 (Prosite link)

    Interpro entry IPR001796 : Dihydrofolate reductase region (Interpro link)

    Interpro description:

    Dihydrofolate reductase (DHFR) catalyses the NADPH-dependent reduction of dihydrofolate to tetrahydrofolate, an essential step in de novo synthesis both of glycine and of purines and deoxythymidine phosphate (the precursors of DNA synthesis), and important also in the conversion of deoxyuridine monophosphate to deoxythymidine monophosphate. Although DHFR is found ubiquitously in prokaryotes and eukaryotes, and is found in all dividing cells, maintaining levels of fully reduced folate coenzymes, the catabolic steps are still not well understood.

    Bacterial species possess distinct DHFR enzymes (based on their pattern of binding diaminoheterocyclic molecules), but mammalian DHFRs are highly similar. The active site is situated in the N-terminal half of the sequence, which includes a conserved Pro-Trp dipeptide; the tryptophan has been shown to be involved in the binding of substrate by the enzyme. Its central role in DNA precursor synthesis, coupled with its inhibition by antagonists such as trimethoprim and methotrexate, which are used as anti-bacterial or anti-cancer agents, has made DHFR a target of anticancer chemotherapy. However, resistance has developed against some drugs, as a result of changes in DHFR itself.

    Proteins where this domain is known:
    PFD0830w   


    PS51344 - HTH_TFE_IIE (Prosite link)

    Interpro entry IPR017919 : Transcription factor TFE/TFIIEalpha, HTH domain (Interpro link)

    Interpro description:

    Initiation of eukaryotic mRNA transcription requires melting of promoter DNA with the help of the general transcription factors TFIIE and TFIIH. In higher eukaryotes, the general transcription factor TFIIE consists of two subunits: the large alpha subunit and the small beta. TFIIE beta has been found to bind to the region where the promoter starts to open to be single-stranded upon transcription initiation by RNA polymerase II. The approximately 120-residue central core domain of TFIIE beta plays a role in double-stranded DNA binding of TFIIE.

    The TFIIE beta central core DNA-binding domain consists of three helices with a beta hairpin at the C-terminus, resembling the winged helix proteins. It shows a novel double-stranded DNA-binding activity where the DNA-binding surface locates on the opposite side to the previously reported winged helix motif by forming a positively charged furrow.

    Archaea contain a TFIIE homolog, called TFE, which corresponds to the N-terminal half of TFIIEalpha. It appears that archaeal TFE corresponds to the minimal essential region of eukaryotic TFIIEalpha. In archaea TFE contains an N-terminal, weakly conserved, helix-turn-helix (HTH) motif within a leucine-rich region and a C-terminal zinc ribbon. It has been proposed that the TFE/IIEalpha-type HTH domain acts as a bridging factor or adapter between the TATA box-binding protein, the polymerase, and possibly promoter DNA.

    The TFE/IIEalpha-type HTH domain adopts a winged HTH (winged helix) fold, comprising three alpha-helices and three beta-strands in the canonical order alpha1-beta1-alpha2-alpha3-beta2-beta3. Conserved residues within helices alpha1-alpha3 form the tightly packed hydrophobic core of the winged helix domain. A specific feature of the structure is the extension of the canonical winged helix fold at the N and C termini by the additional helices alpha0 and alpha4, respectively. Hydrophobic residues from the additional helix alpha0 extend the hydrophobic core of the winged helix domain, and helix alpha0 is tightly packed against the canonical winged helix fold. Helix alpha4 comprises only one turn.

    Proteins where this domain is known:
    MAL7P1.86   


    PS51352 - THIOREDOXIN_2 (Prosite link)

    Interpro entry IPR017936 : Thioredoxin-like (Interpro link)

    Interpro description:

    Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.

    Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.

    A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:

    Bacterial proteins that act as thiol:disulphide interchange proteins and allow disulphide bond formation in some periplasmic proteins also contain a thioredoxin domain. These proteins are:

    This entry represents the thioredoxin domain and homologous domains in other proteins.

    Proteins where this domain is known:
    MAL13P1.225    MAL8P1.17    PF08_0131    PF11_0099    PF11_0286    PF11_0352    PF13_0272    PF14_0368    PF14_0545    PF14_0694    PFI1250w    PFL0725w   


    PS51354 - GLUTAREDOXIN_2 (Prosite link)

    Interpro entry IPR002109 : Glutaredoxin (Interpro link)

    Interpro description:

    Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system.

    Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.

    Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.

    This entry represents Glutaredoxin.

    Proteins where this domain is known:
    PF07_0036    PFC0205c    PFC0271c    PFF0340c   


    PS51355 - GLUTATHIONE_PEROXID_3 (Prosite link)

    Interpro entry IPR000889 : Glutathione peroxidase (Interpro link)

    Interpro description:

    Glutathione peroxidase (GSHPx) is an enzyme that catalyses the reduction of hydroperoxides by glutathione. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function has been shown to be evolutionarily related to those of GSHPxs.

    In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPxs, a selenocysteine.

    Proteins where this domain is known:
    PFL0595c   


    PS51359 - COX5B_2 (Prosite link)

    Interpro entry IPR002124 : Cytochrome c oxidase, subunit Vb (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptidic subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion. Two of these cysteines are clustered in the C-terminal section of the subunit.

    Proteins where this domain is known:
    PFI1365w   


    PS51363 - W2 (Prosite link)

    Interpro entry IPR003307 : (Interpro link)

    Interpro description:

    This entry represents the W2 domain (two invariant tryptophans) and is a region of ~165 amino acids which is found in the C-terminus of the following eIFs:

    Translation initiation is a sophisticated, well-regulated and highly coordinated cellular process in eukaryotes, in which at least 11 eukaryotic initiation factors (eIFs) are involved.

    The W2 domain has a globular fold and is composed exclusively of alpha-helices. The structure can be divided into a structural C-terminal core onto which the two N-terminal helices are attached. The core contains two aromatic/acidic residue-rich regions (AA boxes), which are important for mediating protein-protein interactions.

    The entry covers the entire W2 domain.

    Proteins where this domain is known:
    PFL0335c    PFL0675c   


    PS51366 - MI (Prosite link)

    Interpro entry IPR003891 : (Interpro link)

    Interpro description:

    This entry represents the MI domain (named after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids. It appears in several translation factors and is found in:

    The MI domain consists of seven alpha-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the alpha-helical axes.

    The MI domain has also been named MA3 domain.

    Proteins where this domain is known:
    PF14_0113    PF14_0546    PFL1855w   


    PS51371 - CBS (Prosite link)

    Interpro entry IPR000644 : (Interpro link)

    Interpro description:

    CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes.

    Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites.

    Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.

    In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).

    Proteins where this domain is known:
    PFI1020c    PFI1560c   


    PS51375 - PPR (Prosite link)

    Interpro entry IPR002885 : (Interpro link)

    Interpro description:

    This entry represents the PPR repeat.

    Pentatricopeptide repeat (PPR) proteins are characterised by tandem repeats of a degenerate 35 amino acid motif. Most PPR proteins have roles in mitochondria or plastids. PPR repeats were discovered while screening Arabidopsis proteins for those predicted to be targeted to mitochondria or chloroplasts. Some of these proteins have been shown to play a role in post-transcriptional processes within organelles and they are thought to be sequence-specific RNA-binding proteins. Plant genomes have between one hundred and five hundred PPR genes per genome, whereas non-plant genomes encode two to six PPR proteins.

    Although no PPR structures are yet known, the motif is predicted to fold into a helix-turn-helix structure similar to those found in the tetratricopeptide repeat (TPR) family.

    The plant PPR protein family has been divided in two subfamilies on the basis of their motif content and organisation.

    Examples of PPR repeat-containing proteins include PET309, which may be involved in RNA stabilisation, and crp1, which is involved in RNA processing. The repeat is associated with a predicted plant protein that has a domain organisation similar to the human BRCA1 protein.

    Proteins where this domain is known:
    PF14_0061    PFD0690c   


    PS51379 - 4FE4S_FER_2 (Prosite link)

    Interpro entry IPR017896 : (Interpro link)

    Interpro description:

    Ferredoxins are a group of iron-sulphur proteins which mediate electron transfer in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s). One of these subgroups is the 4Fe-4S ferredoxins, which are found in bacteria and which are thus often referred to as 'bacterial-type' ferredoxins. The structure of these proteins consists of the duplication of a domain of twenty-six amino acid residues; each of these domains contains four cysteine residues that bind to a 4Fe-4S centre.

    Several structures of the 4Fe-4S ferredoxin domain have been determined. The clusters consist of two interleaved 4Fe- and 4S-tetrahedra forming a cubane-like structure, in such a way that the four iron and four sulphur atoms occupy the eight corners of a distorted cube. Each 4Fe-4S is attached to the polypeptide chain by four covalent Fe-S bonds involving cysteine residues.

    A number of proteins have been found that include one or more 4Fe-4S binding domains similar to those of bacterial-type ferredoxins.

    The pattern of cysteine residues in the iron-sulphur region is sufficient to detect this class of 4Fe-4S binding proteins. This entry represents the whole domain.

    Note: In some bacterial ferredoxins, one of the two duplicated domains has lost one or more of the four conserved cysteines. The consequence of such variations is that these domains have either lost their iron-sulphur binding property or bind to a 3Fe-3S centre instead of a 4Fe-4S centre.

    Proteins where this domain is known:
    MAL13P1.344    PFL0630w   


    PS51380 - EXS (Prosite link)

    Interpro entry IPR004342 : EXS, C-terminal (Interpro link)

    Interpro description:

    The EXS domain is named after ERD1/XPR1/SYG1, and proteins containing this motif include the C-terminal of the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be Murine leukemia virus (MLV) receptors (XPR1). The N-terminal of these proteins often has an SPX domain.

    While the N-terminal is thought to be involved in signal transduction, the role of the C-terminal is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) S. cerevisiae proteins. ERD1 proteins are involved in the localization of endogenous endoplasmic reticulum (ER) proteins. Erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localization label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via 'salvage' vesicles.

    Proteins where this domain is known:
    PFF0365c   


    PS51382 - SPX (Prosite link)

    Interpro entry IPR004331 : (Interpro link)

    Interpro description:

    The SPX domain is named after SYG1/Pho81/XPR1 proteins. This 180-residue domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal, suggesting that all the members of this family are involved in G-protein associated signal transduction. The C-terminal of these proteins often has an EXS domain.

    The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. NUC-2 contains several ankyrin repeats.

    Several members of this family are the XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with Murine leukemia virus (MLV). The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor.

    Proteins where this domain is known:
    PFB0440c    PFF0365c    PFF1075w    PFL1455w   


    PS51383 - YJEF_C_3 (Prosite link)

    Interpro entry IPR000631 : (Interpro link)

    Interpro description:

    This family is related to Hydroxyethylthiazole kinase and PfkB carbohydrate kinase, implying that it is also a carbohydrate kinase.

    Several uncharacterised proteins have been shown to share regions of similarities, including yeast chromosome XI hypothetical protein YKL151c; Caenorhabditis elegans hypothetical protein R107.2; Escherichia coli hypothetical protein yjeF; Bacillus subtilis hypothetical protein yxkO; Helicobacter pylori hypothetical protein HP1363; Mycobacterium tuberculosis hypothetical protein MtCY77.05c; Mycobacterium leprae hypothetical protein B229_C2_201; Synechocystis sp. (strain PCC 6803) hypothetical protein sll1433; and Methanocaldococcus jannaschii (Methanococcus jannaschii) hypothetical protein MJ1586. These are proteins of about 30 to 40 kDa whose central region is well conserved.

    Proteins where this domain is known:
    PF11_0453   


    PS51384 - FAD_FR (Prosite link)

    Interpro entry IPR017927 : Ferredoxin reductase-type FAD-binding domain (Interpro link)

    Interpro description:

    Flavoenzymes have the ability to catalyse a wide range of biochemical reactions. They are involved in the dehydrogenation of a variety of metabolites, in electron transfer from and to redox centres, in light emission, in the activation of oxygen for oxidation and hydroxylation reactions. About 1% of all eukaryotic and prokaryotic proteins are predicted to encode a flavin adenine dinucleotide (FAD)-binding domain.

    According to structural similarities and conserved sequence motifs, FAD-binding domains have been grouped in three main families: (i) the ferredoxin reductase (FR)-type FAD-binding domain, (ii) the FAD-binding domains that adopt a Rossmann fold and (iii) the PCMH-type FAD-binding domain.

    The FAD cofactor consists of adenosine monophosphate (AMP) linked to flavin mononucleotide (FMN) by a pyrophosphate bond. The AMP moiety is composed of the adenine ring bonded to a ribose that is linked to a phosphate group. The FMN moiety is composed of the isoalloxazine-flavin ring linked to a ribitol, which is connected to a phosphate group. The flavin functions mainly in a redox capacity, being able to take up two electrons from one substrate and release them two at a time to a substrate or coenzyme, or one at a time to an electron acceptor. The catalytic function of the FAD is concentrated in the isoalloxazine ring, whereas the ribityl phosphate and the AMP moiety mainly stabilise cofactor binding to protein residues.

    The structural core of all FR family members is well conserved. The FAD-binding fold characteristic of the FR family is a cylindrical beta-domain with a flattened six-stranded antiparallel beta-barrel organised into two orthogonal sheets (B1-B2-B5 and B4-B3-B6) separated by one alpha-helix. The cylinder is open between strands B4 and B5, which makes space for the isoalloxazine and ribityl moieties of the FAD. One end of the cylinder is covered by the only helix of the domain, which is essential for the binding of the pyrophosphate groups of the FAD. The FR family contains two conserved motifs. One (R-x-Y-[ST]) is located in B4, where the invariant positively charged Arg residue forms hydrogen bonds to the negative pyrophosphate oxygen atom. The other conserved sequence motif is G-x(2)-[ST]-x(2)-L-x(5)-G-x(7)-P-x-G, which is part of H1-B6 and is known as the phosphate-binding motif.
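
    The two FR-family motifs quoted above are written in PROSITE pattern notation, where x stands for any residue, [ST] for Ser or Thr, and x(n) for n consecutive positions of any residue. As a rough illustration, the following sketch translates such a pattern into a Python regular expression. The helper name prosite_to_regex is hypothetical, and only the notation elements named in its comments are handled; anchors such as < and > are not covered.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.split("-"):
        # Separate an optional repeat count, e.g. x(5) or x(2,4), from the element.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                       # x = any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # {AB} = any residue except A or B
        # Bracketed sets such as [ST] and single letters are already valid regex.
        if low and high:
            core += "{%s,%s}" % (low, high)
        elif low:
            core += "{%s}" % low
        parts.append(core)
    return "".join(parts)

print(prosite_to_regex("R-x-Y-[ST]"))
print(prosite_to_regex("G-x(2)-[ST]-x(2)-L-x(5)-G-x(7)-P-x-G"))
```

    The resulting expressions can be passed to re.search to scan a protein sequence. A hit indicates only that the sequence is compatible with the motif, not that the domain is present.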

    Proteins where this domain is known:
    PF13_0353    PFF1115w   


    PS51385 - YJEF_N (Prosite link)

    Interpro entry IPR004443 : (Interpro link)

    Interpro description:

    The YjeF N-terminal domains occur either as single proteins or fusions with other domains and are commonly associated with enzymes. In bacteria and archaea, YjeF N-terminal domains are often fused to a YjeF C-terminal domain with high structural homology to the members of a ribokinase-like superfamily and/or belong to operons that encode enzymes of diverse functions: pyridoxal phosphate biosynthetic protein PdxJ; phosphopantetheine-protein transferase; ATP/GTP hydrolase; and pyruvate-formate lyase 1-activating enzyme. In plants, the YjeF N-terminal domain is fused to a C-terminal putative pyridoxamine 5'-phosphate oxidase. In eukaryotes, proteins that consist of (Sm)-FDF-YjeF N-terminal domains may be involved in RNA processing.

    The YjeF N-terminal domains represent a novel version of the Rossmann fold, one of the most common protein folds in nature observed in numerous enzyme families, that has acquired a set of catalytic residues and structural features that distinguish them from the conventional dehydrogenases. The YjeF N-terminal domain is comprised of a three-layer alpha-beta-alpha sandwich with a central beta-sheet surrounded by helices. The conservation of the acidic residues in the predicted active site of the YjeF N-terminal domains is reminiscent of the presence of such residues in the active sites of diverse hydrolases.

    Proteins where this domain is known:
    PF14_0570   


    PTHR10012 - Phstyr_phstse_ac (Panther link)

    Interpro entry IPR004327 : Phosphotyrosyl phosphatase activator, PTPA (Interpro link)

    Interpro description:
    Phosphotyrosyl phosphatase activator (PTPA) proteins stimulate the phosphotyrosyl phosphatase (PTPase) activity of the dimeric form of protein phosphatase 2A (PP2A). PTPase activity in PP2A (in vitro) is relatively low when compared to the better recognized phosphoserine/threonine protein phosphorylase activity. The specific biological role of PTPA is unknown. Basal expression of PTPA depends on the activity of a ubiquitous transcription factor, Yin Yang 1 (YY1). The tumour suppressor protein p53 can inhibit PTPA expression through an unknown mechanism that negatively controls YY1.

    Proteins where this domain is known:
    PF14_0280   


    PTHR10025 - PTHR10025 (Panther link)

    Proteins where this domain is known:
    PFF1490w   


    PTHR10025:SF3 - PTHR10025:SF3 (Panther link)

    Proteins where this domain is known:
    PFF1490w   


    PTHR10026 - Trans_reg_cyclin (Panther link)

    Interpro entry IPR015429 : (Interpro link)

    Interpro description:

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.

    The cyclins in this entry are involved in the regulation of RNA polymerase II transcription. These proteins are highly evolutionarily conserved and can be found in species ranging from Arabidopsis thaliana (Mouse-ear cress) to Homo sapiens (Human).

    Proteins where this domain is known:
    PF13_0022    PF14_0605   


    PTHR10026:SF10 - CYCLIN T,K-RELATED (Panther link)

    Proteins where this domain is known:
    PF13_0022   


    PTHR10026:SF8 - CycH (Panther link)

    Interpro entry IPR015432 : (Interpro link)

    Interpro description:

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.

    The cyclins in this entry are involved in the regulation of RNA polymerase II transcription. Cyclin H and its associated cyclin dependent kinase, cdk 7, are components of the TFIIH complex that is involved in both transcription and DNA repair.

    This subfamily of Cyclin H proteins is found in vertebrates ranging from Xenopus laevis (African clawed frog) to Homo sapiens (Human).

    Proteins where this domain is known:
    PF14_0605   


    PTHR10031 - ATP SYNTHASE 9 MITOCHONDRIAL (Panther link)

    Proteins where this domain is known:
    MAL7P1.340   


    PTHR10046 - PTHR10046 (Panther link)

    Proteins where this domain is known:
    PF14_0147   


    PTHR10048 - PI_Kinase (Panther link)

    Interpro entry IPR015433 : Phosphatidylinositol Kinase (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    This group of proteins consists entirely of phosphatidylinositol 3- and phosphatidylinositol 4-kinases. Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase, PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3), function as second messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) acts on phosphatidylinositol (PI) in the first committed step of the production of the second messenger inositol-1,4,5-trisphosphate. The PI3- and PI4-kinases share a well-conserved domain at their C-terminal section, which is distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has a bilobal structure with a small N-terminal lobe and a large C-terminal lobe; this structure is often found in other ATP-dependent kinases. The core of this catalytic domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to other protein kinases, the PI3K loop that interacts with the phosphates of the ATP, known as the glycine-rich loop or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.

    Synonym(s): PIK

    Proteins where this domain is known:
    PFE0485w    PFE0765w   
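
    Each record in this listing follows the same layout: a header line of the form "ID - name (source link)", an optional "Interpro entry" line, and a line of whitespace-separated gene identifiers after "Proteins where this domain is known:". A minimal sketch of reading one record is given here, assuming that layout holds throughout; the function name parse_record and the returned dictionary keys are illustrative choices, not part of the source.

```python
import re

# Patterns inferred from the layout of this listing (an assumption, not a specification).
HEADER = re.compile(r"^(?P<id>\S+) - (?P<name>.+?) \((?P<source>\w+) link\)$")
INTERPRO = re.compile(r"^Interpro entry (?P<acc>IPR\d+) : ?(?P<desc>.*?) ?\(Interpro link\)$")
PROTEIN_MARKER = "Proteins where this domain is known:"


def parse_record(text):
    """Parse one record into a dict (hypothetical helper for illustration)."""
    record = {"id": None, "name": None, "source": None,
              "interpro": None, "proteins": []}
    expect_proteins = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if expect_proteins:
            # The line after the marker holds the gene identifiers.
            record["proteins"] = line.split()
            expect_proteins = False
            continue
        if line == PROTEIN_MARKER:
            expect_proteins = True
            continue
        header = HEADER.match(line)
        if header and record["id"] is None:
            record.update(header.groupdict())
            continue
        interpro = INTERPRO.match(line)
        if interpro:
            record["interpro"] = (interpro.group("acc"), interpro.group("desc"))
    return record


sample = """
    PTHR10048 - PI_Kinase (Panther link)

    Interpro entry IPR015433 : Phosphatidylinositol Kinase (Interpro link)

    Proteins where this domain is known:
    PFE0485w    PFE0765w
"""
parsed = parse_record(sample)
print(parsed["id"], parsed["interpro"], parsed["proteins"])
```

    Free-text description paragraphs are ignored by this sketch; entries whose Interpro name is blank yield an empty description string.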


    PTHR10048:SF15 - PTHR10048:SF15 (Panther link)

    Proteins where this domain is known:
    PFE0485w   


    PTHR10048:SF7 - PTHR10048:SF7 (Panther link)

    Proteins where this domain is known:
    PFE0765w   


    PTHR10050 - PTHR10050 (Panther link)

    Proteins where this domain is known:
    PF10_0104   


    PTHR10052 - PTHR10052 (Panther link)

    Proteins where this domain is known:
    PF13_0224   


    PTHR10052:SF1 - Ribosomal_L18ae (Panther link)

    Interpro entry IPR002670 : Ribosomal protein L18ae (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L18ae forms part of the 60S ribosomal subunit. This family is found in eukaryotes. Rat ribosomal protein L18 is homologous to Xenopus laevis L14.

    Proteins where this domain is known:
    PF13_0224   


    PTHR10055 - Trp_tRNA-synt_1b (Panther link)

    Interpro entry IPR002306 : Tryptophanyl-tRNA synthetase, class Ib (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Tryptophanyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. The crystal structure of tryptophanyl-tRNA synthetase is known.
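
    The class assignment described above can be written as a simple lookup. The sketch below only encodes the two amino acid lists given in the description; the names CLASS_I, CLASS_II and synthetase_class are illustrative and do not come from Interpro.

```python
# Amino acid specificities as listed in the description above.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}


def synthetase_class(amino_acid):
    """Return "I" or "II" for the synthetase specific to the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")


print(synthetase_class("Tryptophan"))
print(synthetase_class("lysine"))
```

    The subclass (a, b or c) within class I is not encoded here, because the description does not list which amino acid falls into which subclass.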

    Proteins where this domain is known:
    PF13_0205    PFL2485c   


    PTHR10060 - TatD_DNase (Panther link)

    Interpro entry IPR015992 : Deoxyribonuclease, TatD (Interpro link)

    Interpro description:
    This family of proteins is related to a large superfamily of metalloenzymes. TatD, a member of this family, has been shown experimentally to be a DNase enzyme. Allantoinase, N-isopropylammelide isopropyl amidohydrolase and the SCN1 protein from fission yeast belong to this family.

    Proteins where this domain is known:
    PFA0580c   


    PTHR10064 - Ribosomal_L22e (Panther link)

    Interpro entry IPR002671 : Ribosomal protein L22e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L22e forms part of the 60S ribosomal subunit. This family is found in eukaryotes. Rattus norvegicus (Rat) L22 is related to ribosomal proteins from other eukaryotes and is identical in amino acid sequence to human EAP, the EBER 1 (Epstein-Barr virus (strain GD1) (HHV-4) (Human herpesvirus 4) encoded RNA) associated protein.

    Proteins where this domain is known:
    PF08_0039   


    PTHR10067 - PS_decarb (Panther link)

    Interpro entry IPR005221 : Phosphatidylserine decarboxylase (Interpro link)

    Interpro description:

    Phosphatidylserine decarboxylase is synthesized as a single chain precursor. Generation of the pyruvoyl active site from a Ser is coupled to cleavage of a Gly-Ser bond between the larger (beta) and smaller (alpha) chains. It is an integral membrane protein.

    Proteins where this domain is known:
    PFI1370c   


    PTHR10072 - HesB_yadR_yfhF (Panther link)

    Interpro entry IPR000361 : (Interpro link)

    Interpro description:

    The proteins in this entry are variously annotated as iron-sulphur cluster insertion protein or Fe/S biogenesis protein. They appear to be involved in Fe-S cluster biogenesis. This family includes IscA, HesB, YadR and YfhF-like proteins. The hesB gene is expressed only under nitrogen fixation conditions. IscA, an 11 kDa member of the hesB family of proteins, binds iron and [2Fe-2S] clusters, and participates in the biosynthesis of iron-sulphur proteins. IscA is able to bind at least 2 iron ions per dimer. Other members of this family include various hypothetical proteins that also contain the NifU-like domain, suggesting that they too are able to bind iron and are involved in Fe-S cluster biogenesis. The HesB family is found in species as divergent as Homo sapiens (Human) and Haemophilus influenzae, suggesting that these proteins are involved in basic cellular functions.

    Proteins where this domain is known:
    PFB0320c    PFC1005c    PFE1135w   


    PTHR10072:SF28 - PTHR10072:SF28 (Panther link)

    Proteins where this domain is known:
    PFC1005c    PFE1135w   


    PTHR10072:SF6 - PTHR10072:SF6 (Panther link)

    Proteins where this domain is known:
    PFB0320c   


    PTHR10073 - DNA_mis_repair (Panther link)

    Interpro entry IPR002099 : DNA mismatch repair protein (Interpro link)

    Interpro description:

    This entry represents DNA mismatch repair proteins, such as MutL. The dimeric MutL protein has a key function in communicating mismatch recognition by MutS to downstream repair processes. Mismatch repair contributes to the overall fidelity of DNA replication by targeting mispaired bases that arise through replication errors, during homologous recombination, and as a result of DNA damage. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex.

    Proteins where this domain is known:
    MAL7P1.145    PF11_0184   


    PTHR10073:SF11 - MLH1 (Panther link)

    Interpro entry IPR011186 : DNA mismatch repair protein Mlh1 (Interpro link)

    Interpro description:

    Mismatch repair is one of five major DNA repair pathways, the others being homologous recombination repair, non-homologous end joining, nucleotide excision repair, and base excision repair. The mismatch repair system recognises and repairs mispaired or unpaired nucleotides that result from errors in DNA replication. Many proteins involved in the different repair processes also play a role in apoptosis when DNA damage is excessive, thereby helping to prevent carcinogenesis. The mismatch repair protein, Mlh1 (mutL homologue 1), has a dual role in DNA repair and apoptosis. Mlh1 acts as a heterodimer in conjunction with Pms1, Pms2 (post-meiotic segregation 1 and 2) or Mlh3 (MutL homologue 3), which function as adaptor proteins that link Msh (MutS homologue) heterodimers to the DNA repair machinery, resulting in excision and repair of the mispaired base.

    Proteins where this domain is known:
    PF11_0184   


    PTHR10073:SF9 - PMS2 (Panther link)

    Interpro entry IPR015434 : (Interpro link)

    Interpro description:

    The Post-Meiotic Segregation 2 (PMS2) protein is a component of the MMR (Mis-Match Repair) complex involved in DNA repair. In Homo sapiens (Human), PMS2 forms an alpha heterodimer with the MLH1 protein. The gene was originally identified as having a low chromosome segregation defect in meiosis in Saccharomyces cerevisiae (Baker's yeast), in which organism MLH1/PMS2 is a single gene. Germline mutations in the PMS2 gene have been shown to give rise to Turcot syndrome, which is the co-occurrence of a primary brain tumour and multiple colorectal adenomas. It has also been shown that in families having a history of hereditary nonpolyposis colorectal cancer (HNPCC), PMS2 was seen to have large internal deletions in the gene. It was later shown that HNPCC patients having germline mutations either in PMS2 or MLH1 had a much higher rate of chromosomal mutations compared to control individuals.

    Proteins where this domain is known:
    MAL7P1.145   


    PTHR10074 - PTHR10074 (Panther link)

    Proteins where this domain is known:
    PF11_0059    PF14_0260    PF14_0387    PFB0275w    PFE0825w    PFE1455w   


    PTHR10074:SF22 - PTHR10074:SF22 (Panther link)

    Proteins where this domain is known:
    PFB0275w   


    PTHR10074:SF29 - PTHR10074:SF29 (Panther link)

    Proteins where this domain is known:
    PF14_0387   


    PTHR10074:SF56 - PTHR10074:SF56 (Panther link)

    Proteins where this domain is known:
    PFE0825w   


    PTHR10093 - NIF_FeS_clus_asmbl_NifU_N (Panther link)

    Interpro entry IPR002871 : NIF system FeS cluster assembly, NifU, N-terminal (Interpro link)

    Interpro description:

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.

    This entry represents the N-terminal of NifU and homologous proteins. NifU contains two domains: an N-terminal and a C-terminal domain. These domains exist either together or on different polypeptides, both domains being found in organisms that do not fix nitrogen (e.g. yeast), so they have a broader significance in the cell than nitrogen fixation.

    Proteins where this domain is known:
    PF14_0518   


    PTHR10102 - RNA_pol_phage (Panther link)

    Interpro entry IPR002092 : DNA-directed RNA polymerase, bacteriophage type (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I synthesises the precursors of most ribosomal RNAs, RNA polymerase II synthesises mRNA precursors, and RNA polymerase III synthesises 5S rRNA, tRNAs and other small RNAs.

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    This is a family of single chain polymerases, which are evolutionarily related, and which are related to the T3/T7 bacteriophage polymerases.

    Proteins where this domain is known:
    PF11_0264   


    PTHR10108 - METHYLTRANSFERASE (Panther link)

    Proteins where this domain is known:
    MAL13P1.214    PFB0220w    PFD0350w   


    PTHR10108:SF16 - PHOSPHOETHANOLAMINE N-METHYLTRANSFERASE (Panther link)

    Proteins where this domain is known:
    MAL13P1.214   


    PTHR10108:SF2 - PTHR10108:SF2 (Panther link)

    Proteins where this domain is known:
    PFD0350w   


    PTHR10108:SF24 - UbiE/COQ5mtfrase (Panther link)

    Interpro entry IPR004033 : UbiE/COQ5 methyltransferase (Interpro link)

    Interpro description:
    A number of methyltransferases have been shown to share regions of similarities. Apart from the ubiquinone/menaquinone biosynthesis methyltransferases (for example, the C-methyltransferase from the ubiE gene of Escherichia coli), the ubiquinone biosynthesis methyltransferases (for example, the C-methyltransferase from the COQ5 gene of Saccharomyces cerevisiae) and the menaquinone biosynthesis methyltransferases (for example, the C-methyltransferase from the MENH gene of Bacillus subtilis), this family also includes methyltransferases involved in biotin and sterol biosynthesis and in phosphatidylethanolamine methylation.

    Proteins where this domain is known:
    PFB0220w   


    PTHR10110 - Cation/H_exchanger_cons-reg (Panther link)

    Interpro entry IPR018422 : Cation/H+ exchanger, conserved region (Interpro link)

    Interpro description:

    Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and in cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport.

    This entry represents a conserved region found in a number of cation/proton exchangers, including Na+/H+ exchangers, K+/H+ exchangers and Na+(K+,Li+,Rb+)/H+ exchangers.

    Proteins where this domain is known:
    PF13_0019   


    PTHR10110:SF6 - Na/H_exchanger_put_cons-reg (Panther link)

    Interpro entry IPR018420 : (Interpro link)

    Interpro description:

    Sodium proton exchangers (NHEs) constitute a large family of integral membrane protein transporters that are responsible for the counter-transport of protons and sodium ions across lipid bilayers. These proteins are found in organisms across all domains of life. In archaea, bacteria, yeast and plants, these exchangers provide increased salt tolerance by removing sodium in exchange for extracellular protons. In mammals they participate in the regulation of cell pH, volume, and intracellular sodium concentration, as well as in the reabsorption of NaCl across renal, intestinal, and other epithelia. Human NHE is also involved in heart disease, cell growth and in cell differentiation. The removal of intracellular protons in exchange for extracellular sodium effectively eliminates excess acid from actively metabolising cells. In mammalian cells, NHE activity is found in both the plasma membrane and inner mitochondrial membrane. To date, nine mammalian isoforms have been identified (designated NHE1-NHE9). These exchangers are highly regulated (glyco)phosphoproteins, which, based on their primary structure, appear to contain 10-12 membrane-spanning regions (M) at the N-terminus and a large cytoplasmic region at the C-terminus. The transmembrane regions M3-M12 share identity with other members of the family. The M6 and M7 regions are highly conserved. Thus, this is thought to be the region that is involved in the transport of sodium and hydrogen ions. The cytoplasmic region has little similarity throughout the family. There is some evidence that the exchangers may exist in the cell membrane as homodimers, but little is currently known about the mechanism of their antiport.

    This entry represents a conserved region found in putative Na+/H+ exchanger proteins from Apicomplexa.

    Proteins where this domain is known:
    PF13_0019   


    PTHR10113 - eRF1 (Panther link)

    Interpro entry IPR004403 : Peptide chain release factor eRF/aRF subunit 1 (Interpro link)

    Interpro description:
    These proteins are translation factors that have been characterised in eukaryotes as the non-GTP-binding subunit of a cytosolic heterodimer that acts as a translation release factor for all three stop codons. Members of this orthologous family are found in Eukarya and Archaea. The name used should be eRF1 for the Eukarya and aRF1 for the Archaea. Alternative names include eRF1, SUP45, omnipotent suppressor protein 1.

    Proteins where this domain is known:
    PFB0550w   


    PTHR10113:SF1 - PTHR10113:SF1 (Panther link)

    Proteins where this domain is known:
    PFB0550w   


    PTHR10114 - Ribosomal_L36e (Panther link)

    Interpro entry IPR000509 : Ribosomal protein L36e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The L36E ribosomal family consists of mammalian, Caenorhabditis elegans and Drosophila L36, Candida albicans L39, and yeast YL39 ribosomal proteins.

    Proteins where this domain is known:
    PF11_0106   


    PTHR10119 - Glu_tRNA-synt_1c (Panther link)

    Interpro entry IPR000924 : Glutamyl/glutaminyl-tRNA synthetase, class Ic (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Glutamyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutaminyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. To date one crystal structure of a glutamyl-tRNA synthetase (Thermus thermophilus) has been solved. The molecule has the form of a bent cylinder and consists of four domains. The N-terminal half (domains 1 and 2) contains the 'Rossmann fold' typical for class I synthetases and resembles the corresponding part of Escherichia coli GlnRS, whereas the C-terminal half exhibits a GluRS-specific structure.

    Proteins where this domain is known:
    MAL13P1.281    PF13_0170    PF13_0257   


    PTHR10119:SF1 - PTHR10119:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.281   


    PTHR10119:SF3 - GlnS (Panther link)

    Interpro entry IPR004514 : Glutaminyl-tRNA synthetase, class Ic (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Glutaminyl-tRNA synthetase is a class Ic synthetase and shows several similarities with glutamyl-tRNA synthetase concerning structure and catalytic properties. It is an alpha2 dimer. Glutaminyl-tRNA synthetase is a relatively rare synthetase, found in the cytosolic compartment of eukaryotes, in Escherichia coli and a number of other Gram-negative bacteria, and in Deinococcus radiodurans. In contrast, the pathway to Gln-tRNA in mitochondria, Archaea, Gram-positive bacteria, and a number of other lineages is by misacylation with Glu followed by transamidation to correct the aminoacylation to Gln.
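
    The class assignments listed in the description above can be captured as a small lookup. This is a minimal sketch: the names CLASS_I, CLASS_II and synthetase_class are hypothetical and not part of Interpro or Panther, and the three-letter amino acid codes are simply transcribed from the text.

```python
# Class assignments transcribed from the description above,
# written as standard three-letter amino acid codes.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def synthetase_class(amino_acid: str) -> str:
    """Return 'I' or 'II' for a three-letter amino acid code (hypothetical helper)."""
    code = amino_acid.capitalize()  # accept 'gln', 'GLN' or 'Gln'
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")
```

    For example, synthetase_class("Gln") gives "I", matching the class Ic assignment of glutaminyl-tRNA synthetase in this entry.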

    Proteins where this domain is known:
    PF13_0170   


    PTHR10119:SF7 - PTHR10119:SF7 (Panther link)

    Proteins where this domain is known:
    PF13_0257   


    PTHR10121 - COATOMER DELTA SUBUNIT (Panther link)

    Proteins where this domain is known:
    PF11_0359   


    PTHR10122 - COX5B (Panther link)

    Interpro entry IPR002124 : Cytochrome c oxidase, subunit Vb (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits, which is known as Vb in mammals, V in Dictyostelium discoideum (Slime mold) and IV in yeast, binds a zinc atom. The sequence of subunit Vb is well conserved and includes three conserved cysteines that coordinate the zinc ion. Two of these cysteines are clustered in the C-terminal section of the subunit.

    Proteins where this domain is known:
    PFI1365w   


    PTHR10126 - TBP (Panther link)

    Interpro entry IPR000814 : TATA-box binding (Interpro link)

    Interpro description:

    The TATA-box binding protein (TBP) is required for the initiation of transcription by RNA polymerases I, II and III, from promoters with or without a TATA box. TBP associates with a host of factors, including the general transcription factors TFIIA, -B, -D, -E, and -H, to form huge multi-subunit pre-initiation complexes on the core promoter. Through its association with different transcription factors, TBP can initiate transcription from different RNA polymerases. There are several related TBPs, including TBP-like (TBPL) proteins.

    The C-terminal core of TBP (~180 residues) is highly conserved and contains two 77-amino acid repeats that produce a saddle-shaped structure that straddles the DNA; this region binds to the TATA box and interacts with transcription factors and regulatory proteins. By contrast, the N-terminal region varies in both length and sequence.

    Proteins where this domain is known:
    PF14_0267    PFE0305w   


    PTHR10126:SF4 - PTHR10126:SF4 (Panther link)

    Proteins where this domain is known:
    PFE0305w   


    PTHR10133 - PTHR10133 (Panther link)

    Proteins where this domain is known:
    PFB0180w    PFF1225c   


    PTHR10134 - Rieske (Panther link)

    Interpro entry IPR014349 : (Interpro link)

    Interpro description:

    Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential which is linked to ATP synthesis.

    The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), present in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.

    The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.

    Proteins where this domain is known:
    PF14_0373   


    PTHR10137 - ATPase_V1_c (Panther link)

    Interpro entry IPR004907 : ATPase, V1 complex, subunit C (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    This entry represents the C subunit that is part of the V1 complex, and is localised to the interface between the V1 and V0 complexes. This subunit does not show any homology with F-ATPase subunits. The C subunit plays an essential role in controlling the assembly of V-ATPase, acting as a flexible stator that holds together the catalytic (V1) and membrane (V0) sectors of the enzyme. The release of subunit C from the ATPase complex results in the dissociation of the V1 and V0 subcomplexes, which is an important mechanism in controlling V-ATPase activity in cells.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PFA0300c   


    PTHR10139 - DNA_repair (Panther link)

    Interpro entry IPR003701 : DNA repair exonuclease (Interpro link)

    Interpro description:

    Mre11 and Rad50 are two proteins required for DNA repair and meiosis-specific double-strand break formation in Saccharomyces cerevisiae. Mre11 by itself has 3' to 5' exonuclease activity that is increased when Mre11 is in a complex with Rad50.

    These eukaryotic proteins contain one metallo-phosphoesterase domain followed by an Mre11 DNA-binding domain. S. cerevisiae Mre11 is required for DNA repair and meiosis-specific double-strand break (DSB) formation and has both 3' to 5' exonuclease activity (which increases when in complex with Rad50) and endonuclease activity. The N-terminal phosphoesterase domain is required for DSB repair, and the carboxyl-terminal dsDNA-binding domain is essential during meiosis for chromatin modification and DSB formation. Schizosaccharomyces pombe rad32 is required for repair of double strand breaks and recombination.

    Proteins where this domain is known:
    PFA0390w   


    PTHR10142 - XPA (Panther link)

    Interpro entry IPR000465 : XPA (Interpro link)

    Interpro description:
    Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. Skin cells of individuals with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-A is the most severe form of the disease and is due to defects in a 30 kDa nuclear protein called XPA (or XPAC). The sequence of the XPA protein is conserved from higher eukaryotes to yeast (gene RAD14). XPA is a hydrophilic protein of 247 to 296 amino-acid residues which has a C4-type zinc finger motif in its central section.

    Proteins where this domain is known:
    MAL7P1.32   


    PTHR10146 - PP_YBL036C (Panther link)

    Interpro entry IPR011078 : (Interpro link)

    Interpro description:

    Pyridoxal phosphate is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.

    Proteins in this entry occur in archaea, bacteria and eukaryotes. They are encoded by genes which are often co-transcribed with proline biosynthesis genes, although their function in vivo has not yet been demonstrated.

    The structure of the yeast protein YBL036C has been determined to a resolution of 2.0 Å. Similar in structure to the N-terminal domains of alanine racemase and ornithine decarboxylase, it forms a TIM barrel fold which begins with a long N-terminal helix, rather than the classical beta strand found at the beginning of most other TIM barrels. Unlike alanine racemase and ornithine decarboxylase, which are two-domain dimeric proteins, the yeast protein is a single domain monomer. A pyridoxal 5'-phosphate cofactor is covalently bound towards the C-terminal end of the barrel, which is the usual active site in TIM-barrel folds. Some racemase activity was observed for this protein and it was suggested by the authors that it may function as a general racemase.

    Proteins where this domain is known:
    PFI0965w   


    PTHR10150 - PTHR10150 (Panther link)

    Proteins where this domain is known:
    MAL13P1.346   


    PTHR10159 - PTHR10159 (Panther link)

    Proteins where this domain is known:
    PF11_0281    PF14_0524    PF14_0525    PFC0380w   


    PTHR10159:SF23 - PTHR10159:SF23 (Panther link)

    Proteins where this domain is known:
    PF11_0281   


    PTHR10159:SF6 - PTHR10159:SF6 (Panther link)

    Proteins where this domain is known:
    PFC0380w   


    PTHR10160 - PTHR10160 (Panther link)

    Proteins where this domain is known:
    PF14_0508   


    PTHR10160:SF2 - PTHR10160:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0508   


    PTHR10161 - PTHR10161 (Panther link)

    Proteins where this domain is known:
    PF14_0614    PFI0880c   


    PTHR10161:SF1 - PTHR10161:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0880c   


    PTHR10168 - GLUTAREDOXIN (Panther link)

    Proteins where this domain is known:
    PFC0271c   


    PTHR10168:SF12 - GLUTAREDOXIN, GRX (Panther link)

    Proteins where this domain is known:
    PFC0271c   


    PTHR10169 - PTHR10169 (Panther link)

    Proteins where this domain is known:
    PF14_0316    PFL1915w   


    PTHR10169:SF2 - PTHR10169:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0316   


    PTHR10169:SF3 - PTHR10169:SF3 (Panther link)

    Proteins where this domain is known:
    PFL1915w   


    PTHR10174 - PTHR10174 (Panther link)

    Proteins where this domain is known:
    PF11_0287    PFF1280w   


    PTHR10174:SF6 - PTHR10174:SF6 (Panther link)

    Proteins where this domain is known:
    PF11_0287    PFF1280w   


    PTHR10183 - CALPAIN (Panther link)

    Proteins where this domain is known:
    MAL13P1.310   


    PTHR10183:SF22 - CALPAIN 7 (Panther link)

    Proteins where this domain is known:
    MAL13P1.310   


    PTHR10196 - FGGY_kin (Panther link)

    Interpro entry IPR000577 : Carbohydrate kinase, FGGY (Interpro link)

    Interpro description:
    It has been shown that several different types of carbohydrate kinases are evolutionarily related. These enzymes include L-fuculokinase (gene fucK); gluconokinase (gene gntK); glycerol kinase (gene glpK); xylulokinase (gene xylB); and L-xylulose kinase (gene lyxK). These enzymes are proteins of 480 to 520 amino acid residues.

    Proteins where this domain is known:
    PF13_0269   


    PTHR10196:SF9 - Glycerol_kin (Panther link)

    Interpro entry IPR005999 : Glycerol kinase (Interpro link)

    Interpro description:

    Glycerol kinase is a bacterial sugar kinase which catalyzes the Mg-ATP-dependent phosphorylation of glycerol to yield glycerol 3-phosphate. The enzyme from Escherichia coli is an allosteric regulatory enzyme whose activity is inhibited by fructose 1,6-bisphosphate (FBP) and the glucose-specific phosphocarrier of the phosphoenolpyruvate:glycose phosphotransferase system, IIA(Glc). Structural studies suggest a nucleophilic in-line transfer mechanism for the ATP-dependent phosphorylation of glycerol by glycerol kinase.

    Proteins where this domain is known:
    PF13_0269   


    PTHR10199 - PTHR10199 (Panther link)

    Proteins where this domain is known:
    PFA0200w   


    PTHR10199:SF6 - PTHR10199:SF6 (Panther link)

    Proteins where this domain is known:
    PFA0200w   


    PTHR10210 - PTHR10210 (Panther link)

    Proteins where this domain is known:
    PF13_0143    PF13_0157   


    PTHR10210:SF14 - PTHR10210:SF14 (Panther link)

    Proteins where this domain is known:
    PF13_0143   


    PTHR10210:SF4 - PTHR10210:SF4 (Panther link)

    Proteins where this domain is known:
    PF13_0157   


    PTHR10211 - DNA_photolyase_2 (Panther link)

    Interpro entry IPR008148 : DNA photolyase, class 2 (Interpro link)

    Interpro description:

    Deoxyribodipyrimidine photolyase (DNA photolyase) is a DNA repair enzyme. It binds to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a near-UV photon (300 to 500 nm), breaks the cyclobutane ring joining the two pyrimidines of the dimer. DNA photolyase is an enzyme that requires two chromophore cofactors for its activity: a reduced FADH2 and either 5,10-methenyltetrahydrofolate (5,10-MTFH) or an oxidized 8-hydroxy-5-deazaflavin (8-HDF) derivative (F420). The folate or deazaflavin chromophore appears to function as an antenna, while the FADH2 chromophore is thought to be responsible for electron transfer. On the basis of sequence similarities DNA photolyases can be grouped into two classes.

    The second class contains enzymes from Myxococcus xanthus, methanogenic archaebacteria, insects, fish and marsupial mammals. It is not yet known what second cofactor is bound to class 2 enzymes. There are a number of conserved sequence regions in all known class 2 DNA photolyases, especially in the C-terminal part.

    Proteins where this domain is known:
    PFE0675c   


    PTHR10219 - PTHR10219 (Panther link)

    Proteins where this domain is known:
    PFI0775w   


    PTHR10223 - PTHR10223 (Panther link)

    Proteins where this domain is known:
    PF08_0109   


    PTHR10229 - PTHR10229 (Panther link)

    Proteins where this domain is known:
    PF11_0143   


    PTHR10231 - Nuc_sug_transpt (Panther link)

    Interpro entry IPR007271 : Nucleotide-sugar transporter (Interpro link)

    Interpro description:

    This family of membrane proteins transports nucleotide sugars from the cytoplasm into Golgi vesicles. Members transport CMP-sialic acid, UDP-galactose and UDP-GlcNAc. This family has some but not complete overlap with the UDP-galactose transporter family.

    Proteins where this domain is known:
    PFE0260w   


    PTHR10233 - IF-2B_related (Panther link)

    Interpro entry IPR000649 : Initiation factor 2B related (Interpro link)

    Interpro description:

    Initiation factor 2 binds to Met-tRNA, GTP and the small ribosomal subunit. The eukaryotic translation initiation factor EIF-2B is a complex made up of five different subunits, alpha, beta, gamma, delta and epsilon, and catalyses the exchange of EIF-2-bound GDP for GTP. This family includes initiation factor 2B alpha, beta and delta subunits from eukaryotes; related proteins from archaebacteria and IF-2 from prokaryotes and also contains a subfamily of proteins in eukaryotes, archaea (e.g. Pyrococcus furiosus), or eubacteria such as Bacillus subtilis and Thermotoga maritima. Many of these proteins were initially annotated as putative translation initiation factors despite the fact that there is no evidence for the requirement of an IF2 recycling factor in prokaryotic translation initiation. Recently, one of these proteins from B. subtilis has been functionally characterised as a 5-methylthioribose-1-phosphate isomerase (MTNA). This enzyme participates in the methionine salvage pathway catalysing the isomerisation of 5-methylthioribose-1-phosphate to 5-methylthioribulose-1-phosphate. The methionine salvage pathway leads to the synthesis of methionine from methylthioadenosine, the end product of the spermidine and spermine anabolism in many species.

    Proteins where this domain is known:
    PF08_0009    PF10_0136    PF13_0126    PFB0460c    PFL1985c    PFL2430c   


    PTHR10233:SF2 - PTHR10233:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0126   


    PTHR10233:SF3 - PTHR10233:SF3 (Panther link)

    Proteins where this domain is known:
    PFL1985c   


    PTHR10233:SF4 - PTHR10233:SF4 (Panther link)

    Proteins where this domain is known:
    PFB0460c   


    PTHR10233:SF7 - PTHR10233:SF7 (Panther link)

    Proteins where this domain is known:
    PF08_0009   


    PTHR10233:SF8 - PTHR10233:SF8 (Panther link)

    Proteins where this domain is known:
    PF10_0136   


    PTHR10233:SF9 - PTHR10233:SF9 (Panther link)

    Proteins where this domain is known:
    PFL2430c   


    PTHR10242 - PTHR10242 (Panther link)

    Proteins where this domain is known:
    PFI0835c   


    PTHR10245 - PTHR10245 (Panther link)

    Proteins where this domain is known:
    PF11_0293   


    PTHR10245:SF1 - PTHR10245:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0293   


    PTHR10252 - PTHR10252 (Panther link)

    Proteins where this domain is known:
    PF14_0374   


    PTHR10252:SF6 - PTHR10252:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0374   


    PTHR10256 - PTHR10256 (Panther link)

    Proteins where this domain is known:
    PFI0505c   


    PTHR10261 - PTHR10261 (Panther link)

    Proteins where this domain is known:
    PF11_0463   


    PTHR10263 - VACUOLAR ATP SYNTHASE PROTEOLIPID SUBUNIT (Panther link)

    Proteins where this domain is known:
    MAL13P1.271    PFE0965c   


    PTHR10264 - PTHR10264 (Panther link)

    Proteins where this domain is known:
    PFC0800w   


    PTHR10266 - Cyt_C1 (Panther link)

    Interpro entry IPR002326 : Cytochrome c1 (Interpro link)

    Interpro description:
    Cytochrome bc1 complex (ubiquinol:ferricytochrome c oxidoreductase) is found in mitochondria, photosynthetic bacteria and other prokaryotes. It is minimally composed of three subunits: cytochrome b, carrying a low- and a high-potential haem group; cytochrome c1 (cyt c1); and a high-potential Rieske iron-sulphur protein. The general function of the complex is electron transfer between two mobile redox carriers, ubiquinol and cytochrome c; the electron transfer is coupled with proton translocation across the membrane, thus generating proton-motive force in the form of an electrochemical potential that can drive ATP synthesis. In its structure and functions, the cytochrome bc1 complex bears extensive analogy to the cytochrome b6f complex of chloroplasts and cyanobacteria; cyt c1 plays an analogous role to cytochrome f, in spite of their different structures.

    Proteins where this domain is known:
    PF14_0597   


    PTHR10281 - PTHR10281 (Panther link)

    Proteins where this domain is known:
    PF14_0714   


    PTHR10281:SF1 - PTHR10281:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0714   


    PTHR10286 - Pyrophosphatase (Panther link)

    Interpro entry IPR008162 : Inorganic pyrophosphatase (Interpro link)

    Interpro description:

    Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi) which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.

    The sequences of PPases share some regions of similarity, among which is a region that contains three conserved aspartates that are involved in the binding of cations.

    Proteins where this domain is known:
    PFC0710w   


    PTHR10288 - PTHR10288 (Panther link)

    Proteins where this domain is known:
    PF14_0151    PFF0250w   


    PTHR10288:SF31 - PTHR10288:SF31 (Panther link)

    Proteins where this domain is known:
    PF14_0151   


    PTHR10290 - DNA TOPOISOMERASE TYPE I (Panther link)

    Proteins where this domain is known:
    PFE0520c   


    PTHR10291 - UPP_synth (Panther link)

    Interpro entry IPR001441 : Di-trans-poly-cis-decaprenylcistransferase-like (Interpro link)

    Interpro description:

    Synonym(s): Di-trans-poly-cis-undecaprenyl-diphosphate synthase, Undecaprenyl pyrophosphate synthetase, Undecaprenyl pyrophosphate synthase, UPP synthetase

    Di-trans-poly-cis-decaprenylcistransferase (UPP synthetase) generates undecaprenyl pyrophosphate (UPP) from isopentenyl pyrophosphate (IPP). This bacterial enzyme is also found in archaebacteria and in a number of uncharacterised proteins including some from yeasts.

    This entry also matches related enzymes that transfer alkyl groups, such as dehydrodolichyl diphosphate synthase.

    Proteins where this domain is known:
    MAL8P1.22   


    PTHR10292 - PTHR10292 (Panther link)

    Proteins where this domain is known:
    PFL0930w   


    PTHR10292:SF1 - PTHR10292:SF1 (Panther link)

    Proteins where this domain is known:
    PFL0930w   


    PTHR10293 - Glutredox-rel (Panther link)

    Interpro entry IPR004480 : (Interpro link)

    Interpro description:

    Glutaredoxins, also known as thioltransferases (disulphide reductases), are small proteins of approximately one hundred amino-acid residues which utilise glutathione and NADPH as cofactors. Oxidized glutathione is regenerated by glutathione reductase. Together these components compose the glutathione system.

    Glutaredoxin functions as an electron carrier in the glutathione-dependent synthesis of deoxyribonucleotides by the enzyme ribonucleotide reductase. Like thioredoxin, which functions in a similar way, glutaredoxin possesses an active centre disulphide bond. It exists in either a reduced or an oxidized form where the two cysteine residues are linked in an intramolecular disulphide bond.

    Glutaredoxin has been sequenced in a variety of species. On the basis of extensive sequence similarity, it has been proposed that Vaccinia virus protein O2L is most probably a glutaredoxin. Finally, it must be noted that Bacteriophage T4 thioredoxin also seems to be evolutionarily related. In position 5 of the pattern T4 thioredoxin has Val instead of Pro.

    This family groups a number of hypothetical proteins from different organisms which are related to glutaredoxin proteins.

    Proteins where this domain is known:
    PF07_0036    PFC0205c    PFF0340c   


    PTHR10302 - Single_strand_bd (Panther link)

    Interpro entry IPR011344 : Single-strand DNA-binding (Interpro link)

    Interpro description:

    All proteins in this family, for which functions are known, are single-stranded DNA-binding proteins that function in many processes including transcription, repair, replication and recombination. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University).

    Proteins where this domain is known:
    PFE0435c   


    PTHR10309 - PTHR10309 (Panther link)

    Proteins where this domain is known:
    MAL8P1.156   


    PTHR10317 - PTHR10317 (Panther link)

    Proteins where this domain is known:
    PFE1405c   


    PTHR10322 - DNA_pol_B (Panther link)

    Interpro entry IPR006172 : DNA-directed DNA polymerase, family B (Interpro link)

    Interpro description:

    DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA-polymerases by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are 2 different methods of priming, these are mediated by 2 very similar polymerase classes, A and B, with similar methods of chain elongation.

    A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA binding domain, and binding domains for both dNTPs and pyrophosphate.
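
    Template-directed 5'-3' elongation from a primer can be illustrated with a toy model. This is only a sketch of the base-pairing logic, not of any real polymerase: the function name extend_primer and its arguments are hypothetical, the template is assumed to be written 3'->5', and the primer is assumed to be DNA already annealed at the start of the template.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend an annealed primer to the end of the template (toy model).

    template_3to5: template strand written in the 3'->5' direction.
    primer_5to3:   primer written 5'->3', paired with the first bases
                   of the template.
    Returns the full new strand written 5'->3'.
    """
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    # The primer must base-pair with the template before elongation starts.
    for template_base, primer_base in zip(template_3to5, primer_5to3):
        if COMPLEMENT[template_base] != primer_base:
            raise ValueError("primer does not pair with the template")
    product = primer_5to3
    # Each step appends one complementary residue to the 3'-end of the chain.
    for template_base in template_3to5[len(primer_5to3):]:
        product += COMPLEMENT[template_base]
    return product
```

    The new strand only ever grows at its 3'-end, which is why the elongation domain is described as 5'-3'.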

    Proteins where this domain is known:
    PF10_0165    PF10_0362    PFD0590c   


    PTHR10322:SF3 - PTHR10322:SF3 (Panther link)

    Proteins where this domain is known:
    PFD0590c   


    PTHR10322:SF4 - DNA POLYMERASE DELTA CATALYTIC SUBUNIT (Panther link)

    Proteins where this domain is known:
    PF10_0165   


    PTHR10322:SF5 - PTHR10322:SF5 (Panther link)

    Proteins where this domain is known:
    PF10_0362   


    PTHR10332 - DER/eqnu_transpt (Panther link)

    Interpro entry IPR002259 : Delayed-early response protein/equilibrative nucleoside transporter (Interpro link)

    Interpro description:

    Delayed-early response (DER) gene products include growth progression factors and several unknown products of novel cDNAs. Murine and human cDNAs from one novel DER gene (DER12) have been characterised to identify its product and to examine its role in the growth response. Both sequences encode a hydrophobic 36 kDa protein that is predicted to contain 8 transmembrane (TM) domains. The protein has been localised to the nucleolus, where its concentration increases following mitogen stimulation.

    Although the function of the protein is unknown, its identification as a nucleolar gene transcriptionally activated by growth factors implicates it as participating in the proliferative response. Sequence analysis reveals the protein to share a high degree of similarity with the C-terminal portion of equilibrative nucleoside transporters. These proteins are integral membrane proteins which enable the movement of hydrophilic nucleosides and nucleoside analogs down their concentration gradients across cell membranes. ENT family members have been identified in humans, mice, fish, tunicates, slime molds, and bacteria.

    Proteins where this domain is known:
    PF13_0252   


    PTHR10335 - Fibrillarin (Panther link)

    Interpro entry IPR000692 : Fibrillarin (Interpro link)

    Interpro description:
    Fibrillarin is a component of a nucleolar small nuclear ribonucleoprotein (snRNP), functioning in vivo in ribosomal RNA processing. It is associated with U3, U8 and U13 small nuclear RNAs in mammals and is similar to the yeast NOP1 protein. Fibrillarin has a well conserved sequence of around 320 amino acids, and contains 3 domains: an N-terminal Gly/Arg-rich region; a central domain resembling other RNA-binding proteins and containing an RNP-2-like consensus sequence; and a C-terminal alpha-helical domain. An evolutionarily related pre-rRNA processing protein, which lacks the Gly/Arg-rich domain, has been found in various archaebacteria.

    Proteins where this domain is known:
    PF14_0068   


    PTHR10336 - PTHR10336 (Panther link)

    Proteins where this domain is known:
    PF10_0132   


    PTHR10336:SF19 - PTHR10336:SF19 (Panther link)

    Proteins where this domain is known:
    PF10_0132   


    PTHR10344 - Thymidylate_kin (Panther link)

    Interpro entry IPR018094 : Thymidylate kinase (Interpro link)

    Interpro description:

    Thymidylate kinase (dTMP kinase) catalyzes the phosphorylation of thymidine 5'-monophosphate (dTMP) to form thymidine 5'-diphosphate (dTDP) in the presence of ATP and magnesium:

     ATP + thymidine 5'-phosphate = ADP + thymidine 5'-diphosphate  

    Thymidylate kinase is a ubiquitous enzyme of about 25 kDa and is important in the dTTP synthesis pathway for DNA synthesis. Knowledge of the function of dTMP kinase in eukaryotes comes from the study of a cell cycle mutant, cdc8, in Saccharomyces cerevisiae. Structural and functional analyses suggest that the cDNA codes for authentic human dTMP kinase. The mRNA levels and enzyme activities corresponded to cell cycle progression and cell growth stages.

    Proteins where this domain is known:
    PFL2465c   


    PTHR10351 - PTHR10351 (Panther link)

    Proteins where this domain is known:
    PF14_0241   


    PTHR10352 - PTHR10352 (Panther link)

    Proteins where this domain is known:
    MAL8P1.83   


    PTHR10359 - PTHR10359 (Panther link)

    Proteins where this domain is known:
    PF11_0306    PFF0715c   


    PTHR10359:SF1 - PTHR10359:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0306   


    PTHR10359:SF2 - PTHR10359:SF2 (Panther link)

    Proteins where this domain is known:
    PFF0715c   


    PTHR10366 - PTHR10366 (Panther link)

    Proteins where this domain is known:
    PF08_0077    PF10_0137   


    PTHR10366:SF25 - PTHR10366:SF25 (Panther link)

    Proteins where this domain is known:
    PF10_0137   


    PTHR10366:SF32 - GDP_mann_dehyd (Panther link)

    Interpro entry IPR006368 : GDP-mannose 4,6-dehydratase (Interpro link)

    Interpro description:

    This family represents GDP-mannose 4,6-dehydratase, also known as GDP-D-mannose dehydratase. This enzyme converts GDP-mannose to GDP-4-dehydro-6-deoxy-D-mannose, the first of three steps for the conversion of GDP-mannose to GDP-fucose in animals, plants, and bacteria. In bacteria, GDP-L-fucose acts as a precursor of surface antigens such as the extracellular polysaccharide colanic acid of Escherichia coli. Excluded from this family are members of the clade that are poorly related because of highly derived (phylogenetically long-branch) sequences, e.g. Aneurinibacillus thermoaerophilus Gmd, described as a bifunctional GDP-mannose 4,6-dehydratase/GDP-6-deoxy-D-lyxo-4-hexulose reductase.

    Proteins where this domain is known:
    PF08_0077   


    PTHR10367 - PTHR10367 (Panther link)

    Proteins where this domain is known:
    PF14_0144   


    PTHR10369 - Ribosomal_L44e (Panther link)

    Interpro entry IPR000552 : Ribosomal protein L44e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian, Trypanosoma brucei, Caenorhabditis elegans and fungal L44, and Haloarcula marismortui LA.

    Proteins where this domain is known:
    PFC0200w   


    PTHR10374 - LACTOYLGLUTATHIONE LYASE (GLYOXALASE I) (Panther link)

    Proteins where this domain is known:
    PF11_0145   


    PTHR10381 - Pept_S14_ClpP (Panther link)

    Interpro entry IPR001907 : Peptidase S14, ClpP (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
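    The relationship between linear triad order and clan described above can be illustrated with a small lookup. This is only a sketch: the function names are hypothetical, the table covers just the three clans named in the description, and the residue numbers in the example (His57, Asp102, Ser195) are the conventional chymotrypsin numbering, which is not taken from this listing.

```python
# Linear order of catalytic residues for the three clans named above.
CLAN_BY_TRIAD_ORDER = {
    "HDS": "PA (chymotrypsin clan)",
    "DHS": "SB (subtilisin clan)",
    "SDH": "SC (carboxypeptidase clan)",
}


def triad_order(positions):
    """Return the catalytic residues sorted by sequence position, e.g. 'HDS'.

    `positions` maps a one-letter residue code to its position in the sequence.
    """
    ordered = sorted(positions.items(), key=lambda item: item[1])
    return "".join(residue for residue, _ in ordered)


def guess_clan(positions):
    """Look up the clan whose triad order matches, among PA, SB and SC only."""
    return CLAN_BY_TRIAD_ORDER.get(triad_order(positions), "no match among PA, SB, SC")


# Conventional chymotrypsin numbering (an assumption, not from the listing).
chymotrypsin = {"H": 57, "D": 102, "S": 195}
print(triad_order(chymotrypsin), "->", guess_clan(chymotrypsin))
```

    A matching order is only a hint: as the description notes, the linear arrangement commonly, not always, reflects clan membership.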

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to the MEROPS peptidase family S14 (ClpP endopeptidase family, clan SK). ClpP is an ATP-dependent protease that cleaves a number of proteins, such as casein and albumin. It exists as a heterodimer of ATP-binding regulatory A and catalytic P subunits, both of which are required for effective levels of protease activity in the presence of ATP, although the P subunit alone does possess some catalytic activity. This family of sequences represents the P subunit.

    Proteases highly similar to ClpP have been found to be encoded in the genome of bacteria, metazoa, some viruses and in the chloroplast of plants. A number of the proteins in this family are classified as non-peptidase homologues as they have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.

    Proteins where this domain is known:
    PF14_0348    PFC0310c   


    PTHR10388 - PTHR10388 (Panther link)

    Proteins where this domain is known:
    PFL2095w   


    PTHR10394 - PTHR10394 (Panther link)

    Proteins where this domain is known:
    PF14_0083   


    PTHR10394:SF1 - PTHR10394:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0083   


    PTHR10408 - STEROL O-ACYLTRANSFERASE (Panther link)

    Proteins where this domain is known:
    PFC0995c   


    PTHR10410 - PTHR10410 (Panther link)

    Proteins where this domain is known:
    MAL13P1.343    PF14_0191   


    PTHR10410:SF1 - PTHR10410:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0191   


    PTHR10410:SF5 - PTHR10410:SF5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.343   


    PTHR10414 - ETHANOLAMINEPHOSPHOTRANSFERASE (Panther link)

    Proteins where this domain is known:
    PFF1375c   


    PTHR10414:SF1 - gb def: LipB protein, putative (Panther link)

    Proteins where this domain is known:
    PFF1375c   


    PTHR10416 - PTHR10416 (Panther link)

    Proteins where this domain is known:
    PFC0340w   


    PTHR10420 - PTHR10420 (Panther link)

    Proteins where this domain is known:
    MAL7P1.147    PF13_0096    PF14_0145    PFA0220w    PFD0165w    PFD0655c    PFE0835w    PFE1355c   


    PTHR10420:SF31 - PTHR10420:SF31 (Panther link)

    Proteins where this domain is known:
    PF13_0096   


    PTHR10420:SF36 - PTHR10420:SF36 (Panther link)

    Proteins where this domain is known:
    PFD0655c   


    PTHR10420:SF39 - PTHR10420:SF39 (Panther link)

    Proteins where this domain is known:
    PFE1355c   


    PTHR10420:SF43 - PTHR10420:SF43 (Panther link)

    Proteins where this domain is known:
    MAL7P1.147   


    PTHR10420:SF45 - PTHR10420:SF45 (Panther link)

    Proteins where this domain is known:
    PFA0220w   


    PTHR10420:SF51 - PTHR10420:SF51 (Panther link)

    Proteins where this domain is known:
    PF14_0145   


    PTHR10420:SF56 - PTHR10420:SF56 (Panther link)

    Proteins where this domain is known:
    PFD0165w   


    PTHR10420:SF72 - PTHR10420:SF72 (Panther link)

    Proteins where this domain is known:
    PFE0835w   


    PTHR10429 - PurDNA_glycsylse (Panther link)

    Interpro entry IPR003180 : Methylpurine-DNA glycosylase (MPG) (Interpro link)

    Interpro description:

    Methylpurine-DNA glycosylase is a base excision-repair protein. It is responsible for the hydrolysis of the deoxyribose N-glycosidic bond, excising 3-methyladenine and 3-methylguanine from damaged DNA. Its action is induced by alkylating chemotherapeutics, as well as deaminated and lipid peroxidation-induced purine adducts. MPG without an N-terminal extension excises hypoxanthine with one-third of the efficiency of full-length MPG under similar conditions, suggesting that its function may largely be attributable to the N-terminal extension.

    Proteins where this domain is known:
    PF10_0061    PF14_0639   


    PTHR10430 - PTHR10430 (Panther link)

    Proteins where this domain is known:
    MAL7P1.159   


    PTHR10430:SF1 - PTHR10430:SF1 (Panther link)

    Proteins where this domain is known:
    MAL7P1.159   


    PTHR10432 - PTHR10432 (Panther link)

    Proteins where this domain is known:
    MAL13P1.120    MAL13P1.303    MAL8P1.40    PF10_0194    PF11_0083    PF13_0315    PF13_0318    PF14_0096    PF14_0194    PFB0255w    PFD0700c    PFI0820c    PFI1025w    PFI1435w    PFL0830w    PFL1170w    PFL1745c   


    PTHR10432:SF28 - RRM_rel (Panther link)

    Interpro entry IPR015464 : (Interpro link)

    Interpro description:

    Eukaryotic single-stranded RNA-binding proteins often contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the putative RNA-binding region RNP-1 signature, or RNA recognition motif (RRM). RRMs are found in a variety of canonical RNA-binding proteins. These include heterogeneous nuclear ribonucleoproteins (hnRNPs), implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs), central players in mRNA splicing. The motif also appears in a few single-stranded DNA-binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, and a third helix present during RNA-binding in some cases. The biological role of proteins classified in this subfamily is unknown.

    Proteins where this domain is known:
    PF13_0318   


    PTHR10432:SF35 - RNP_BRUNO-like (Panther link)

    Interpro entry IPR015903 : (Interpro link)

    Interpro description:

    Eukaryotic single-stranded RNA-binding proteins often contain one or more copies of a putative RNA-binding domain of approximately 90 amino acids. This is known as the putative RNA-binding region RNP-1 signature, or RNA recognition motif (RRM). RRMs are found in a variety of canonical RNA-binding proteins. These include heterogeneous nuclear ribonucleoproteins (hnRNPs), implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs), central players in mRNA splicing. The motif also appears in a few single-stranded DNA-binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, and a third helix present during RNA-binding in some cases.

    Bruno is an RNA-binding protein in Drosophila that acts as a translational repressor and is involved in multiple aspects of pattern formation in embryonic development. Bruno-like RNA-binding proteins, also referred to as CUG-BP, etr-like, or CELF proteins, display high homology to Bruno and contain multiple RRM domains. These types of RNA-binding proteins function in a wide variety of biological processes.

    Proteins where this domain is known:
    MAL13P1.303    PF13_0315    PF14_0096    PFI0820c    PFL1745c   


    PTHR10432:SF36 - PTHR10432:SF36 (Panther link)

    Proteins where this domain is known:
    MAL8P1.40   


    PTHR10432:SF51 - PTHR10432:SF51 (Panther link)

    Proteins where this domain is known:
    PF14_0194   


    PTHR10432:SF57 - PTHR10432:SF57 (Panther link)

    Proteins where this domain is known:
    PF10_0194   


    PTHR10432:SF58 - PTHR10432:SF58 (Panther link)

    Proteins where this domain is known:
    PFI1025w   


    PTHR10432:SF60 - CC1_SF (Panther link)

    Interpro entry IPR006509 : Splicing factor, CC1-like (Interpro link)

    Interpro description:

    These sequences represent a subfamily of RNA splicing factors including the Pad-1 protein (Neurospora crassa), CAPER (mouse) and CC1.3 (human). All are characterised by an N-terminal arginine-rich, low complexity domain followed by three (or in the case of 4 human paralogs, two) RNA recognition domains. These splicing factors are closely related to the U2AF splicing factor family.

    Proteins where this domain is known:
    MAL13P1.120   


    PTHR10432:SF64 - PTHR10432:SF64 (Panther link)

    Proteins where this domain is known:
    PFL0830w   


    PTHR10432:SF65 - RNA-BINDING PROTEIN 28 (Panther link)

    Proteins where this domain is known:
    PFB0255w   


    PTHR10432:SF69 - POLYADENYLATE-BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    PFL1170w   


    PTHR10432:SF8 - PTHR10432:SF8 (Panther link)

    Proteins where this domain is known:
    PFD0700c   


    PTHR10432:SF9 - PTHR10432:SF9 (Panther link)

    Proteins where this domain is known:
    PF11_0083   


    PTHR10434 - PTHR10434 (Panther link)

    Proteins where this domain is known:
    PF14_0421   


    PTHR10436 - PTHR10436 (Panther link)

    Proteins where this domain is known:
    PFE1080w    PFI0685w   


    PTHR10438 - Trx (Panther link)

    Interpro entry IPR015467 : Thioredoxin, core (Interpro link)

    Interpro description:

    Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of two cysteine thiol groups to a disulphide, accompanied by the transfer of two electrons and two protons. The net result is the covalent interconversion of a disulphide and a dithiol. In the NADPH-dependent protein disulphide reduction, thioredoxin reductase (TR) catalyses the reduction of oxidised thioredoxin (trx) by NADPH using FAD and its redox-active disulphide; reduced thioredoxin then directly reduces the disulphide in the substrate protein.

    Thioredoxin is present in prokaryotes and eukaryotes and the sequence around the redox-active disulphide bond is well conserved. All thioredoxins contain a cis-proline located in a loop preceding beta-strand 4, which makes contact with the active site cysteines, and is important for stability and function. Thioredoxin belongs to a structural family that includes glutaredoxin, glutathione peroxidase, bacterial protein disulphide isomerase DsbA, and the N-terminal domain of glutathione transferase. Thioredoxins have a beta-alpha unit preceding the motif common to all these proteins.

    A number of eukaryotic proteins contain domains evolutionarily related to thioredoxin; most of them are protein disulphide isomerases (PDI). PDI is an endoplasmic reticulum multi-functional enzyme that catalyses the formation and rearrangement of disulphide bonds during protein folding. All PDIs contain two or three (ERp72) copies of the thioredoxin domain, each of which contributes to disulphide isomerase activity, but which are functionally non-equivalent. Moreover, PDI exhibits chaperone-like activity towards proteins that contain no disulphide bonds, i.e. behaving independently of its disulphide isomerase activity. The various forms of PDI which are currently known are:

    Bacterial proteins that act as thiol:disulphide interchange proteins that allow disulphide bond formation in some periplasmic proteins also contain a thioredoxin domain. These proteins are:

    This entry represents the core thioredoxin domain.

    Proteins where this domain is known:
    MAL13P1.225    PF14_0545    PF14_0590    PFI0790w    PFI1250w   


    PTHR10438:SF16 - THIOREDOXIN H-TYPE-RELATED (Panther link)

    Proteins where this domain is known:
    PF14_0545    PFI1250w   


    PTHR10438:SF17 - PTHR10438:SF17 (Panther link)

    Proteins where this domain is known:
    MAL13P1.225   


    PTHR10438:SF18 - PTHR10438:SF18 (Panther link)

    Proteins where this domain is known:
    PFI0790w   


    PTHR10442 - Ribosomal_S21E (Panther link)

    Interpro entry IPR001931 : Ribosomal protein S21e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 82 to 87 amino acids. The amino termini are all N alpha-acetylated. The N-terminal halves of the protein molecules are highly conserved in contrast to the carboxy-terminal parts.

    Proteins where this domain is known:
    PF11_0454   


    PTHR10458 - Fmet_deformylase (Panther link)

    Interpro entry IPR000181 : Formylmethionine deformylase (Interpro link)

    Interpro description:

    Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:

     N-formyl-L-methionyl peptide + H2O = formate + methionyl peptide

    Catalytic efficiency strongly depends on the identity of the bound metal.

    The structure of these enzymes is known. PDF, a member of the zinc metalloproteases family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.
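    As a rough illustration of the HEXXH motif mentioned above (His, Glu, any two residues, His), the following sketch scans a one-letter protein sequence for it. The function name and the toy sequence are made up for the example; a real search would normally rely on a curated signature rather than a bare pattern, and a match alone does not show that a protein is a metalloprotease.

```python
import re

# HEXXH: His-Glu-X-X-His, where X is any residue.
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence):
    """Return (1-based start position, matched motif) for each HEXXH hit."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(sequence.upper())]


# Made-up fragment used only to exercise the function.
toy_sequence = "MSVLQAAGHEIDHLNGK"
print(find_hexxh(toy_sequence))  # [(9, 'HEIDH')]
```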

    Proteins where this domain is known:
    PFI0380c   


    PTHR10459 - ATP-DEPENDENT DNA LIGASE FAMILY (Panther link)

    Proteins where this domain is known:
    MAL13P1.22   


    PTHR10459:SF10 - DNA LIGASE I (Panther link)

    Proteins where this domain is known:
    MAL13P1.22   


    PTHR10466 - PMM (Panther link)

    Interpro entry IPR005002 : Eukaryotic phosphomannomutase (Interpro link)

    Interpro description:
    This enzyme is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions.

    Proteins where this domain is known:
    PF10_0169   


    PTHR10472 - DTyrtRNA_deacyls (Panther link)

    Interpro entry IPR003732 : D-tyrosyl-tRNA(Tyr) deacylase (Interpro link)

    Interpro description:

    This homodimeric enzyme appears able to cleave any D-amino acid (and glycine, which does not have distinct D/L forms) from charged tRNA. The name reflects characterization with respect to D-Tyr on tRNA(Tyr) as established in the literature, but substrate specificity seems much broader.

    Proteins where this domain is known:
    PF11_0095   


    PTHR10472:SF2 - PTHR10472:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0095   


    PTHR10476 - PTHR10476 (Panther link)

    Proteins where this domain is known:
    PF08_0064    PFI0300w   


    PTHR10476:SF2 - PTHR10476:SF2 (Panther link)

    Proteins where this domain is known:
    PFI0300w   


    PTHR10476:SF3 - PTHR10476:SF3 (Panther link)

    Proteins where this domain is known:
    PF08_0064   


    PTHR10483 - PTHR10483 (Panther link)

    Proteins where this domain is known:
    PF14_0061    PFF0765c    PFL1605w   


    PTHR10483:SF16 - PTHR10483:SF16 (Panther link)

    Proteins where this domain is known:
    PFF0765c   


    PTHR10484 - Histone_H4 (Panther link)

    Interpro entry IPR001951 : Histone H4 (Interpro link)

    Interpro description:
    Histone H4 is one of the four histones, along with H2A, H2B and H3, which form the eukaryotic nucleosome core. Along with H3, it plays a central role in nucleosome formation. The sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution.

    Proteins where this domain is known:
    PF11_0061   


    PTHR10485 - PTHR10485 (Panther link)

    Proteins where this domain is known:
    PF14_0328   


    PTHR10496 - Ribosomal_S24E (Panther link)

    Interpro entry IPR001976 : Ribosomal protein S24e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family contains the S24e ribosomal proteins from eukaryotes and archaebacteria. These proteins have 101 to 148 amino acids.

    Proteins where this domain is known:
    PFE0975c   


    PTHR10497 - Ribosomal_L27e (Panther link)

    Interpro entry IPR001141 : Ribosomal protein L27e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L27 is found in fungi, plants, algae and vertebrates. The family has a specific signature at the C terminus.

    Proteins where this domain is known:
    PF14_0579   


    PTHR10499 - PTHR10499 (Panther link)

    Proteins where this domain is known:
    PFB0770c    PFI1205c    PFI1475w    PFL1960w   


    PTHR10501 - PTHR10501 (Panther link)

    Proteins where this domain is known:
    MAL13P1.35    PFI1695c   


    PTHR10516 - PPIase_FKBP (Panther link)

    Interpro entry IPR001179 : Peptidyl-prolyl cis-trans isomerase, FKBP-type (Interpro link)

    Interpro description:

    Synonym(s): Peptidylprolyl cis-trans isomerase

    FKBP-type peptidylprolyl isomerases in vertebrates are receptors for the two immunosuppressants, FK506 and rapamycin. The drugs inhibit T cell proliferation by arresting two distinct cytoplasmic signal transmission pathways. Peptidylprolyl isomerases accelerate protein folding by catalysing the cis-trans isomerisation of proline imidic peptide bonds in oligopeptides. These proteins are found in a variety of organisms.

    Proteins where this domain is known:
    MAL13P1.68    PF11_0124    PF13_0190    PFL2275c   


    PTHR10516:SF16 - PTHR10516:SF16 (Panther link)

    Proteins where this domain is known:
    MAL13P1.68   


    PTHR10516:SF19 - PTHR10516:SF19 (Panther link)

    Proteins where this domain is known:
    PFL2275c   


    PTHR10516:SF28 - PTHR10516:SF28 (Panther link)

    Proteins where this domain is known:
    PF11_0124   


    PTHR10516:SF8 - PTHR10516:SF8 (Panther link)

    Proteins where this domain is known:
    PF13_0190   


    PTHR10527 - PTHR10527 (Panther link)

    Proteins where this domain is known:
    PF08_0069    PFE1195w    PFF1345w   


    PTHR10527:SF1 - PTHR10527:SF1 (Panther link)

    Proteins where this domain is known:
    PF08_0069   


    PTHR10527:SF2 - PTHR10527:SF2 (Panther link)

    Proteins where this domain is known:
    PFF1345w   


    PTHR10527:SF5 - IMPORTIN BETA-3 (Panther link)

    Proteins where this domain is known:
    PFE1195w   


    PTHR10534 - PTHR10534 (Panther link)

    Proteins where this domain is known:
    PFF0775w   


    PTHR10535 - PTHR10535 (Panther link)

    Proteins where this domain is known:
    PF13_0341   


    PTHR10536 - DNA_primase_S_euk_arch (Panther link)

    Interpro entry IPR014052 : DNA primase, small subunit, eukaryotic and archaeal (Interpro link)

    Interpro description:

    DNA primase synthesizes the RNA primers for the Okazaki fragments in lagging strand DNA synthesis. DNA primase is a heterodimer of large (p60) and small (p50) subunits in eukaryotes. This family represents sequences of the small subunit and the DNA primase sequences of the Archaea. No sequence similarity can be detected between the eukaryotic p50 and p60 subunits and the primases purified from bacteriophage and bacteria.

    This entry represents the eukaryotic and archaeal proteins, and does not include viral proteins.

    Proteins where this domain is known:
    PF14_0366   


    PTHR10537 - DNA_primase_lrg_euk (Panther link)

    Interpro entry IPR007238 : DNA primase, large subunit, eukaryotic/archaeal (Interpro link)

    Interpro description:
    DNA primase is the polymerase that synthesises small RNA primers for the Okazaki fragments made during discontinuous DNA replication. DNA primase is a heterodimer of two subunits, the small subunit Pri1 (48 kDa in yeast), and the large subunit Pri2 (58 kDa in the yeast Saccharomyces cerevisiae). Both subunits participate in the formation of the active site, but the ATP binding site is located on the small subunit. Primase function has also been demonstrated for human and mouse primase subunits.

    Proteins where this domain is known:
    PFI0530c   


    PTHR10539 - PTHR10539 (Panther link)

    Proteins where this domain is known:
    PF10_0298   


    PTHR10540 - PTHR10540 (Panther link)

    Proteins where this domain is known:
    PFI0630w    PFI0895c   


    PTHR10540:SF6 - PTHR10540:SF6 (Panther link)

    Proteins where this domain is known:
    PFI0895c   


    PTHR10540:SF7 - PTHR10540:SF7 (Panther link)

    Proteins where this domain is known:
    PFI0630w   


    PTHR10544 - Ribosomal_L28e (Panther link)

    Interpro entry IPR002672 : Ribosomal protein L28e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L28e forms part of the 60S ribosomal subunit. This family is found in eukaryotes. In rat there are 9 or 10 copies of the L28 gene. The L28 protein contains a possible internal duplication of 9 residues.

    Proteins where this domain is known:
    PF11_0437   


    PTHR10548 - PTHR10548 (Panther link)

    Proteins where this domain is known:
    PF10_0047    PF10_0217    PF11_0205    PFE0865c   


    PTHR10553 - PTHR10553 (Panther link)

    Proteins where this domain is known:
    MAL8P1.48    PFL0460w   


    PTHR10553:SF1 - PTHR10553:SF1 (Panther link)

    Proteins where this domain is known:
    PFL0460w   


    PTHR10553:SF2 - PTHR10553:SF2 (Panther link)

    Proteins where this domain is known:
    MAL8P1.48   


    PTHR10555 - PTHR10555 (Panther link)

    Proteins where this domain is known:
    PF07_0017   


    PTHR10555:SF8 - PTHR10555:SF8 (Panther link)

    Proteins where this domain is known:
    PF07_0017   


    PTHR10556 - PTHR10556 (Panther link)

    Proteins where this domain is known:
    PF11_0370   


    PTHR10556:SF1 - PTHR10556:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0370   


    PTHR10562 - PTHR10562 (Panther link)

    Proteins where this domain is known:
    PFE0285c   


    PTHR10566 - PTHR10566 (Panther link)

    Proteins where this domain is known:
    PF08_0098    PF14_0143   


    PTHR10566:SF7 - PTHR10566:SF7 (Panther link)

    Proteins where this domain is known:
    PF08_0098    PF14_0143   


    PTHR10571 - PTHR10571 (Panther link)

    Proteins where this domain is known:
    PFC0935c   


    PTHR10585 - ER_ret_rcpt (Panther link)

    Interpro entry IPR000133 : ER lumen protein retaining receptor (Interpro link)

    Interpro description:

    Proteins resident in the lumen of the endoplasmic reticulum (ER) contain a C-terminal tetrapeptide, commonly known as Lys-Asp-Glu-Leu (KDEL) in mammals and His-Asp-Glu-Leu (HDEL) in yeast (Saccharomyces cerevisiae) that acts as a signal for their retrieval from subsequent compartments of the secretory pathway. The receptor for this signal is a ~26 kDa Golgi membrane protein, initially identified as the ERD2 gene product in S. cerevisiae. The receptor molecule, known variously as the ER lumen protein retaining receptor or the 'KDEL receptor', is believed to cycle between the cis side of the Golgi apparatus and the ER. It has also been characterised in a number of other species, including plants, Plasmodium, Drosophila and mammals. In mammals, 2 highly related forms of the receptor are known.

    The KDEL receptor is a highly hydrophobic protein of 220 residues; its sequence exhibits 7 hydrophobic regions, all of which have been suggested to traverse the membrane. More recently, however, it has been suggested that only 6 of these regions are transmembrane (TM), resulting in both N- and C-termini on the cytoplasmic side of the membrane.
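
    The C-terminal retrieval signal described above lends itself to a simple sequence check. The following is a minimal sketch, not part of the InterPro entry: the function name, the default signal set and the handling of a trailing "*" are assumptions made for illustration, and real ER-resident proteins may carry variant tetrapeptides that this check does not cover.

```python
# Illustrative sketch: the names below are hypothetical, not from InterPro.
# Only the two tetrapeptides named in the description are checked.
ER_RETENTION_SIGNALS = ("KDEL", "HDEL")


def has_er_retention_signal(sequence, signals=ER_RETENTION_SIGNALS):
    """Return True if a one-letter protein sequence ends with one of `signals`."""
    seq = sequence.strip().upper()
    # Assumption: some exports mark the stop codon with a trailing "*".
    seq = seq.rstrip("*")
    return seq.endswith(tuple(signals))
```

    For example, has_er_retention_signal("MKTAYIAKDEL") is true, while a sequence with KDEL only in its interior is rejected because the signal must be C-terminal.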

    Proteins where this domain is known:
    MAL13P1.163    PF13_0280   


    PTHR10585:SF13 - ER LUMEN PROTEIN RETAINING RECEPTOR (Panther link)

    Proteins where this domain is known:
    PF13_0280   


    PTHR10585:SF5 - PTHR10585:SF5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.163   


    PTHR10588 - PTHR10588 (Panther link)

    Proteins where this domain is known:
    MAL13P1.238    MAL8P1.46    PF10_0320    PF11_0476    PF14_0305    PF14_0496    PF14_0785    PFE0455w    PFI1470c    PFL1360c   


    PTHR10588:SF20 - gb def: Arabidopsis thaliana At5g22320/MWD9_11 (Panther link)

    Proteins where this domain is known:
    PF14_0496   


    PTHR10588:SF23 - PTHR10588:SF23 (Panther link)

    Proteins where this domain is known:
    PF10_0320   


    PTHR10588:SF26 - TESTIS SPECIFIC LEUCINE RICH REPEAT PROTEIN (Panther link)

    Proteins where this domain is known:
    PFL1360c   


    PTHR10588:SF28 - PTHR10588:SF28 (Panther link)

    Proteins where this domain is known:
    MAL8P1.46   


    PTHR10588:SF33 - PTHR10588:SF33 (Panther link)

    Proteins where this domain is known:
    PFI1470c   


    PTHR10588:SF6 - UNCHARACTERIZED (Panther link)

    Proteins where this domain is known:
    PFE0455w   


    PTHR10588:SF7 - UNCHARACTERIZED (Panther link)

    Proteins where this domain is known:
    MAL13P1.238   


    PTHR10589 - Peptidase_C12 (Panther link)

    Interpro entry IPR001578 : Peptidase C12, ubiquitin carboxyl-terminal hydrolase 1 (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C12 (ubiquitin C-terminal hydrolase family, clan CA). Families within the CA clan are loosely termed papain-like as the protein fold of the peptidase unit resembles that of papain, the type example for clan CA. The type example is the human ubiquitin C-terminal hydrolase UCH-L1.

    Ubiquitin is highly conserved, commonly found conjugated to proteins in eukaryotic cells, where it may act as a marker for rapid degradation, or it may have a chaperone function in protein assembly. The ubiquitin is released by cleavage from the bound protein by a protease. A number of deubiquitinising proteases are known: all are activated by thiol compounds, and inhibited by thiol-blocking agents and ubiquitin aldehyde, and as such have the properties of cysteine proteases.

    The deubiquitinising proteases can be split into 2 size ranges (20-30 kDa and 100-200 kDa): this family comprises the 20-30 kDa peptides, which include the yeast yuh1. Yeast yuh1 protease is known to be active only against small ubiquitin conjugates, being inactive against conjugated beta-galactosidase. A mammalian homologue, UCH (ubiquitin conjugate hydrolase), is one of the most abundant proteins in the brain. Only one conserved cysteine can be identified, along with two conserved histidines. The spacing between the cysteine and the second histidine is thought to be more representative of the cysteine/histidine spacing of a cysteine protease catalytic dyad.

    Proteins where this domain is known:
    PF11_0177    PF14_0576   


    PTHR10589:SF16 - PTHR10589:SF16 (Panther link)

    Proteins where this domain is known:
    PF11_0177   


    PTHR10589:SF17 - PTHR10589:SF17 (Panther link)

    Proteins where this domain is known:
    PF14_0576   


    PTHR10593 - PTHR10593 (Panther link)

    Proteins where this domain is known:
    PFD0975w    PFL1490w   


    PTHR10593:SF1 - PTHR10593:SF1 (Panther link)

    Proteins where this domain is known:
    PFD0975w   


    PTHR10593:SF5 - PTHR10593:SF5 (Panther link)

    Proteins where this domain is known:
    PFL1490w   


    PTHR10598 - PTHR10598 (Panther link)

    Proteins where this domain is known:
    PF08_0021   


    PTHR10602 - PTHR10602 (Panther link)

    Proteins where this domain is known:
    PF07_0117   


    PTHR10615 - PTHR10615 (Panther link)

    Proteins where this domain is known:
    PF11_0192   


    PTHR10615:SF35 - PTHR10615:SF35 (Panther link)

    Proteins where this domain is known:
    PF11_0192   


    PTHR10619 - Factin_cap_beta (Panther link)

    Interpro entry IPR001698 : F-actin capping protein, beta subunit (Interpro link)

    Interpro description:

    The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.

    The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.

    The beta subunit is a protein of about 280 amino acid residues whose sequence is well conserved in eukaryotic species.

    Proteins where this domain is known:
    PFE0880c   


    PTHR10621 - PTHR10621 (Panther link)

    Proteins where this domain is known:
    PF10_0114   


    PTHR10623 - PTHR10623 (Panther link)

    Proteins where this domain is known:
    PFC0305w   


    PTHR10625 - His_deacetylse (Panther link)

    Interpro entry IPR000286 : (Interpro link)

    Interpro description:
    Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal of the acetyl group. Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are all members of this ancient protein superfamily.

    Proteins where this domain is known:
    PF10_0078    PF14_0690    PFE0328w    PFI1260c   


    PTHR10625:SF15 - HISTONE DEACETYLASE-RELATED (Panther link)

    Proteins where this domain is known:
    PF14_0690   


    PTHR10625:SF22 - PTHR10625:SF22 (Panther link)

    Proteins where this domain is known:
    PF10_0078   


    PTHR10625:SF28 - His_deacetylse_1 (Panther link)

    Interpro entry IPR003084 : Histone deacetylase (Interpro link)

    Interpro description:
    Histones can be reversibly acetylated on several lysine residues. Regulation of transcription is caused in part by this mechanism. Histone deacetylases catalyse the removal of the acetyl group. Histone deacetylases, acetoin utilization proteins and acetylpolyamine amidohydrolases are all members of this ancient protein superfamily.

    HDAs function in multi-subunit complexes, reversing the acetylation of histones by histone acetyltransferases, and are also believed to deacetylate general transcription factors such as TFIIF and sequence-specific transcription factors such as p53. Thus, HDAs contribute to the regulation of transcription, in particular transcriptional repression. At N-terminal tails of histones, removal of the acetyl group from the epsilon-amino group of a lysine side chain will restore its positive charge, which may stabilise the histone-DNA interaction and prevent activating transcription factors binding to promoter elements. HDAs play important roles in the cell cycle and differentiation, and their deregulation can contribute to the development of cancer.

    Proteins where this domain is known:
    PFI1260c   


    PTHR10625:SF9 - HISTONE DEACETYLASE 11 (Panther link)

    Proteins where this domain is known:
    PFE0328w   


    PTHR10629 - C5_DNA_meth (Panther link)

    Interpro entry IPR001525 : C-5 cytosine-specific DNA methylase (Interpro link)

    Interpro description:
    C-5 cytosine-specific DNA methylases (C5 Mtase) are enzymes that specifically methylate the C-5 carbon of cytosines in DNA to produce C5-methylcytosine. In mammalian cells, cytosine-specific methyltransferases methylate certain CpG sequences, which are believed to modulate gene expression and cell differentiation. In bacteria, these enzymes are a component of restriction-modification systems and serve as valuable tools for the manipulation of DNA. The structure of HhaI methyltransferase (M.HhaI) has been resolved to 2.5 A: the molecule folds into 2 domains - a larger catalytic domain containing catalytic and cofactor binding sites, and a smaller DNA recognition domain.

    Proteins where this domain is known:
    MAL7P1.151    PFL1005c   


    PTHR10629:SF1 - gb def: Hypothetical protein T09A5.8 in chromosome III (Panther link)

    Proteins where this domain is known:
    PFL1005c   


    PTHR10629:SF8 - PTHR10629:SF8 (Panther link)

    Proteins where this domain is known:
    MAL7P1.151   


    PTHR10631 - TRM_mtfrase (Panther link)

    Interpro entry IPR002905 : N2,N2-dimethylguanosine tRNA methyltransferase (Interpro link)

    Interpro description:
    This enzyme uses S-adenosyl-L-methionine to methylate tRNA:
     S-AdoMet + tRNA = S-adenosyl-L-homocysteine + tRNA containing N2-methylguanine
    The TRM1 gene of Saccharomyces cerevisiae is necessary for the N2,N2-dimethylguanosine modification of both mitochondrial and cytoplasmic tRNAs. The enzyme is found in both eukaryotes and archaea.

    Proteins where this domain is known:
    PF13_0109   


    PTHR10634 - PTHR10634 (Panther link)

    Proteins where this domain is known:
    PF08_0056   


    PTHR10634:SF5 - PTHR10634:SF5 (Panther link)

    Proteins where this domain is known:
    PF08_0056   


    PTHR10635 - PTHR10635 (Panther link)

    Proteins where this domain is known:
    PF14_0277   


    PTHR10641 - Myb_transfac (Panther link)

    Interpro entry IPR015495 : (Interpro link)

    Interpro description:

    The Myb gene family became a topic of interest following the discovery of the v-Myb avian retroviral oncogene and its cellular homologue, c-Myb. Mammals, birds, and amphibians were all found to contain three different Myb-related genes. The vertebrate Myb proteins are nuclear and can bind to the same specific DNA sequences YAAC(G/T)G. The Myb domain is included in the SANT domain family, and is a conserved region consisting of three tandem repeats. A-Myb and c-Myb encode tissue-specific transcriptional activators. In contrast, B-Myb appears to be essential in all dividing cells and transcriptional activation appears unlikely to be its primary physiologic function. Only single Myb genes are present in invertebrates such as the sea urchin (Strongylocentrotus purpuratus (Purple sea urchin)) and Drosophila melanogaster and these genes most closely resemble vertebrate B-Myb. A Myb-related transcription factor gene has been isolated from the cellular slime mold, Dictyostelium discoideum (Slime mold). Myb repeat-containing transcription factors are highly represented in plants, with more than two hundred proteins represented in maize and over one hundred present in Arabidopsis thaliana (Mouse-ear cress). Other Myb-related proteins include essential components of the mRNA splicing machinery such as CDC5/CEF1.
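
    The consensus site YAAC(G/T)G mentioned above can be expressed as a regular expression, since Y is the IUPAC code for a pyrimidine (C or T). The following is a minimal sketch, not part of the InterPro entry: the function name is hypothetical, only the strand as given is scanned, and a pattern match says nothing about whether a site is actually bound in vivo.

```python
import re

# Y = C or T (IUPAC pyrimidine); (G/T) becomes the character class [GT].
# The lookahead lets overlapping occurrences be reported as well.
MYB_SITE = re.compile(r"(?=([CT]AAC[GT]G))")


def find_myb_sites(dna):
    """Return (0-based start, hexamer) for each YAAC(G/T)G match on the given strand."""
    return [(m.start(), m.group(1)) for m in MYB_SITE.finditer(dna.upper())]
```

    Scanning the reverse complement as well would be needed to cover both strands; that step is left out here to keep the sketch short.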

    Proteins where this domain is known:
    PF10_0327   


    PTHR10641:SF17 - CELL DIVISION CONTROL PROTEIN 5 (Panther link)

    Proteins where this domain is known:
    PF10_0327   


    PTHR10644 - PTHR10644 (Panther link)

    Proteins where this domain is known:
    PFC0780w    PFL1680w   


    PTHR10644:SF1 - PTHR10644:SF1 (Panther link)

    Proteins where this domain is known:
    PFL1680w   


    PTHR10644:SF3 - PTHR10644:SF3 (Panther link)

    Proteins where this domain is known:
    PFC0780w   


    PTHR10648 - PTHR10648 (Panther link)

    Proteins where this domain is known:
    MAL13P1.105   


    PTHR10652 - CAP (Panther link)

    Interpro entry IPR001837 : Adenylate cyclase-associated CAP (Interpro link)

    Interpro description:

    Cyclase-associated proteins (CAPs) are highly conserved actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways. In Drosophila, CAP functions in Hedgehog-mediated eye development and in establishing oocyte polarity. In Dictyostelium (slime mold), CAP is involved in microfilament reorganisation near the plasma membrane in a PIP2-regulated manner and is required to perpetuate the cAMP relay signal to organise fruitbody formation. In plants, CAP is involved in plant signalling pathways required for co-ordinated organ expansion. In yeast, CAP is involved in adenylate cyclase activation, as well as in vesicle trafficking and endocytosis. In both yeast and mammals, CAPs appear to be involved in recycling G-actin monomers from ADF/cofilins for subsequent rounds of filament assembly. In mammals, there are two different CAPs (CAP1 and CAP2) that share 64% amino acid identity.

    All CAPs appear to contain a C-terminal actin-binding domain that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes. CAP directly regulates actin filament dynamics and has been implicated in a number of complex developmental and morphological processes, including mRNA localisation and the establishment of cell polarity. Actin exists both as globular (G) (monomeric) actin subunits and assembled into filamentous (F) actin. In cells, actin cycles between these two forms. Proteins that bind F-actin often regulate F-actin assembly and its interaction with other proteins, while proteins that interact with G-actin often control the availability of unpolymerised actin. CAPs bind G-actin.

    In addition to actin-binding, CAPs can have additional roles, and may act as bifunctional proteins. In Saccharomyces cerevisiae (Baker's yeast), CAP is a component of the adenylyl cyclase complex (Cyr1p) that serves as an effector of Ras during normal cell signalling. S. cerevisiae CAP functions to expose adenylate cyclase binding sites to Ras, thereby enabling adenylate cyclase to be activated by Ras regulatory signals. In Schizosaccharomyces pombe (Fission yeast), CAP is also required for adenylate cyclase activity, but not through the Ras pathway. In both organisms, the N-terminal domain is responsible for adenylate cyclase activation, but the S. cerevisiae and S. pombe N-termini cannot complement one another. Yeast CAPs are unique among the CAP family of proteins, because they are the only ones to directly interact with and activate adenylate cyclase. S. cerevisiae CAP has four major domains. In addition to the N-terminal adenylate cyclase-interacting domain, and the C-terminal actin-binding domain, it possesses two other domains: a proline-rich domain that interacts with Src homology 3 (SH3) domains of specific proteins, and a domain that is responsible for CAP oligomerisation to form multimeric complexes (although oligomerisation appears to involve the N- and C-terminal domains as well). The proline-rich domain interacts with profilin, a protein that catalyses nucleotide exchange on G-actin monomers and promotes addition to barbed ends of filamentous F-actin. Since CAP can bind profilin via a proline-rich domain, and G-actin via a C-terminal domain, it has been suggested that a ternary G-actin/CAP/profilin complex could be formed.

    This entry represents CAP proteins from various organisms.

    Proteins where this domain is known:
    PFA0260c   


    PTHR10653 - F-actin_cap_A (Panther link)

    Interpro entry IPR002189 : F-actin capping protein, alpha subunit (Interpro link)

    Interpro description:

    The actin filament system, a prominent part of the cytoskeleton in eukaryotic cells, is both a static structure and a dynamic network that can undergo rearrangements: it is thought to be involved in processes such as cell movement and phagocytosis, as well as muscle contraction.

    The F-actin capping protein binds in a calcium-independent manner to the fast growing ends of actin filaments (barbed end) thereby blocking the exchange of subunits at these ends. Unlike gelsolin and severin this protein does not sever actin filaments. The F-actin capping protein is a heterodimer composed of two unrelated subunits: alpha and beta. Neither of the subunits shows sequence similarity to other filament-capping proteins.

    The alpha subunit is a protein of about 268 to 286 amino acid residues whose sequence is well conserved in eukaryotic species.

    Proteins where this domain is known:
    PFE1420w   


    PTHR10658 - PI_transfer (Panther link)

    Interpro entry IPR001666 : Phosphatidylinositol transfer protein (Interpro link)

    Interpro description:
    Phosphatidylinositol transfer protein (PITP) is a ubiquitous cytosolic protein, thought to be involved in transport of phospholipids from their site of synthesis in the endoplasmic reticulum and Golgi to other cell membranes. More recently, PITP has been shown to be an essential component of the polyphosphoinositide synthesis machinery and is hence required for proper signalling by epidermal growth factor and f-Met-Leu-Phe, as well as for exocytosis. The role of PITP in polyphosphoinositide synthesis may also explain its involvement in intracellular vesicular traffic.

    Proteins where this domain is known:
    MAL13P1.256   


    PTHR10658:SF1 - PTHR10658:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.256   


    PTHR10660 - Proteasome_activ_REG_asu/bsu (Panther link)

    Interpro entry IPR009077 : Proteasome activator pa28, REG alpha/beta subunit (Interpro link)

    Interpro description:

    The 20S proteasome is a multicatalytic complex that is responsible for the non-lysosomal degradation of intracellular proteins. The proteasome is composed of a catalytic core that is regulated by protein complexes, which bind to the ends of the cylindrical core structure. One of these regulatory complexes is the PA28 activator complex (also known as the 11S regulator, or REG), a ring-shaped hexameric structure that enhances the peptidase activity of the core enzyme. Three REG subunits have been isolated, REGalpha, REGbeta and REGgamma. REGalpha and REGbeta preferentially form a heteromeric complex with alternating alpha and beta subunits. The structure of the human REGalpha subunit reveals a heptameric barrel-shaped assembly containing a central channel. The binding of REG is thought to create a pore through which substrates and products can pass.

    Proteins where this domain is known:
    PFI0370c   


    PTHR10660:SF3 - PTHR10660:SF3 (Panther link)

    Proteins where this domain is known:
    PFI0370c   


    PTHR10663 - PTHR10663 (Panther link)

    Proteins where this domain is known:
    PF14_0407   


    PTHR10663:SF8 - PTHR10663:SF8 (Panther link)

    Proteins where this domain is known:
    PF14_0407   


    PTHR10666 - PTHR10666 (Panther link)

    Proteins where this domain is known:
    MAL13P1.64    PF13_0084    PF13_0346    PF14_0027    PFL0585w   


    PTHR10666:SF11 - PTHR10666:SF11 (Panther link)

    Proteins where this domain is known:
    MAL13P1.64   


    PTHR10666:SF2 - PTHR10666:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0027   


    PTHR10666:SF9 - UBIQUITIN (RIBOSOMAL PROTEIN L40) (Panther link)

    Proteins where this domain is known:
    PF13_0346    PFL0585w   


    PTHR10670 - PTHR10670 (Panther link)

    Proteins where this domain is known:
    PFF1470c   


    PTHR10676 - PTHR10676 (Panther link)

    Proteins where this domain is known:
    MAL7P1.162    MAL7P1.89    PF10_0224    PF11_0240    PF14_0626    PFI0260c   


    PTHR10676:SF23 - PTHR10676:SF23 (Panther link)

    Proteins where this domain is known:
    PF14_0626   


    PTHR10676:SF24 - DYNEIN HEAVY CHAIN (Panther link)

    Proteins where this domain is known:
    MAL7P1.89   


    PTHR10676:SF31 - PTHR10676:SF31 (Panther link)

    Proteins where this domain is known:
    PF11_0240   


    PTHR10676:SF32 - PTHR10676:SF32 (Panther link)

    Proteins where this domain is known:
    PF10_0224   


    PTHR10676:SF34 - PTHR10676:SF34 (Panther link)

    Proteins where this domain is known:
    MAL7P1.162   


    PTHR10676:SF35 - PTHR10676:SF35 (Panther link)

    Proteins where this domain is known:
    PFI0260c   


    PTHR10677 - Ubiquilin (Panther link)

    Interpro entry IPR015496 : (Interpro link)

    Interpro description:

    Ubiquitin is a protein of seventy-six amino acid residues, found in all eukaryotic cells and whose sequence is extremely well conserved from protozoa to vertebrates. It is widely known as a post-translational tag used to signal a protein's hydrolytic destruction. Other functions for ubiquitin depend on its differential internal isopeptide linkages. In addition, several ubiquitin-like proteins have been discovered from genome-sequencing efforts, other structural studies, and genetic screens. These new data show that proteins with the ubiquitin domain are adaptable, transposable genetic elements, which have been appended to other genes and utilised for many different cellular functions, depending on the ubiquitin-like protein's identity, subcellular location, and method of covalent attachment. The post-translational ligation of proteins to members of the ubiquitin superfamily can signal many different fates for the target protein.

    Ubiquitin is a globular protein, the last four C-terminal residues (Leu-Arg-Gly-Gly) extending from the compact structure to form a 'tail' important for its function. The latter is mediated by the covalent conjugation of ubiquitin to target proteins, by an isopeptide linkage between the C-terminal glycine and the epsilon amino group of lysine residues in the target proteins.

    Ubiquilin is a Ubiquitin-like (UBL) protein and has an N-terminal UBL domain and a C-terminal Ub-associated (UBA) domain in its structure.

    Proteins where this domain is known:
    PF11_0142    PF11_0329   


    PTHR10678 - PTHR10678 (Panther link)

    Proteins where this domain is known:
    PF14_0025   


    PTHR10678:SF1 - PTHR10678:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0025   


    PTHR10681 - PEROXIREDOXIN (Panther link)

    Proteins where this domain is known:
    PF08_0131    PF10_0268    PF14_0368    PFL0725w   


    PTHR10681:SF11 - gb def: 1-cys peroxidoxin (Panther link)

    Proteins where this domain is known:
    PF08_0131   


    PTHR10681:SF5 - PTHR10681:SF5 (Panther link)

    Proteins where this domain is known:
    PF10_0268   


    PTHR10681:SF8 - PEROXIREDOXINS, PRX-1, PRX-2, PRX-3 (Panther link)

    Proteins where this domain is known:
    PF14_0368    PFL0725w   


    PTHR10682 - PTHR10682 (Panther link)

    Proteins where this domain is known:
    PFF1240w   


    PTHR10682:SF5 - PTHR10682:SF5 (Panther link)

    Proteins where this domain is known:
    PFF1240w   


    PTHR10694 - PTHR10694 (Panther link)

    Proteins where this domain is known:
    MAL8P1.111   


    PTHR10694:SF8 - PTHR10694:SF8 (Panther link)

    Proteins where this domain is known:
    MAL8P1.111   


    PTHR10695 - Depp_CoAkinase (Panther link)

    Interpro entry IPR001977 : Dephospho-CoA kinase (Interpro link)

    Interpro description:

    This family contains dephospho-CoA kinases, which catalyze the final step in CoA biosynthesis, the phosphorylation of the 3'-hydroxyl group of ribose using ATP as a phosphate donor.

    The crystal structures of a number of the proteins in this entry have been determined, including the structure of the protein from Haemophilus influenzae to 2.0-A resolution in a complex with ATP. The protein consists of three domains: the nucleotide-binding domain with a five-stranded parallel beta-sheet, the substrate-binding alpha-helical domain, and the lid domain formed by a pair of alpha-helices; the overall topology of the protein resembles the structures of other nucleotide kinases.

    Proteins where this domain is known:
    PF14_0415   


    PTHR10698 - ATPase_V1_H (Panther link)

    Interpro entry IPR004908 : ATPase, V1 complex, subunit H (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    This entry represents subunit H (also known as Vma13p) found in the V1 complex of V-ATPases. This subunit has a regulatory function, being responsible for activating ATPase activity and coupling ATPase activity to proton flow. The yeast enzyme contains five motifs similar to the HEAT or Armadillo repeats seen in the importins, and can be divided into two distinct domains: a large N-terminal domain consisting of stacked alpha helices, and a smaller C-terminal alpha-helical domain with a similar superhelical topology to an armadillo repeat.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0034   


    PTHR10701 - PTHR10701 (Panther link)

    Proteins where this domain is known:
    PF14_0146   


    PTHR10715 - Ribosomal_L6E (Panther link)

    Interpro entry IPR000915 : Ribosomal protein L6E (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian ribosomal protein L6 (L6 was previously known as TAX-responsive enhancer element binding protein 107); Caenorhabditis elegans ribosomal protein L6 (R151.3); Saccharomyces cerevisiae (Baker's yeast) ribosomal protein YL16A/YL16B; and Mesembryanthemum crystallinum (Common ice plant) ribosomal protein YL16-like. These proteins have 175 (yeast) to 287 (mammalian) amino acids.

    Proteins where this domain is known:
    PF13_0213   


    PTHR10721 - PTHR10721 (Panther link)

    Proteins where this domain is known:
    PF11_0265   


    PTHR10721:SF1 - PTHR10721:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0265   


    PTHR10722 - PTHR10722 (Panther link)

    Proteins where this domain is known:
    PFF0700c   


    PTHR10730 - PTHR10730 (Panther link)

    Proteins where this domain is known:
    PFI1465w   


    PTHR10730:SF1 - PTHR10730:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1465w   


    PTHR10732 - Ribosomal_S17E (Panther link)

    Interpro entry IPR001210 : Ribosomal protein S17e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped in this family of ribosomal proteins, S17e. They include vertebrate, Drosophila and Neurospora crassa (crp-3) S17s, as well as yeast S17a (RP51A) and S17b (RP51B), and archaebacterial S17e.

    Proteins where this domain is known:
    PFL2055w   


    PTHR10738 - Skb1_mtfrase (Panther link)

    Interpro entry IPR007857 : Skb1 methyltransferase (Interpro link)

    Interpro description:
    The human homologue of Saccharomyces cerevisiae Skb1 (Shk1 kinase-binding protein 1) is a protein methyltransferase. These proteins seem to play a role in Jak signalling.

    Proteins where this domain is known:
    PF13_0323   


    PTHR10739 - PTHR10739 (Panther link)

    Proteins where this domain is known:
    MAL13P1.86    PF13_0253   


    PTHR10739:SF13 - PTHR10739:SF13 (Panther link)

    Proteins where this domain is known:
    MAL13P1.86   


    PTHR10739:SF14 - PTHR10739:SF14 (Panther link)

    Proteins where this domain is known:
    PF13_0253   


    PTHR10742 - AMINE OXIDASE (Panther link)

    Proteins where this domain is known:
    PF10_0275   


    PTHR10743 - Rer1 (Panther link)

    Interpro entry IPR004932 : Retrieval of early ER protein Rer1 (Interpro link)

    Interpro description:

    RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early Golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex.

    Proteins where this domain is known:
    PFI0150c   


    PTHR10744 - Ribosomal_S17 (Panther link)

    Interpro entry IPR000266 : Ribosomal protein S17 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The ribosomal proteins catalyse ribosome assembly and stabilise the rRNA, tuning the structure of the ribosome for optimal function. Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S17 is known to bind specifically to the 5' end of 16S ribosomal RNA in Escherichia coli (primary rRNA binding protein), and is thought to be involved in the recognition of termination codons. Experimental evidence has revealed that S17 has virtually no groups exposed on the ribosomal surface.

    Proteins where this domain is known:
    MAL13P1.327    PFC0775w   


    PTHR10744:SF2 - PTHR10744:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0775w   


    PTHR10745 - tRNA-synt_gly (Panther link)

    Interpro entry IPR018160 : Glycyl-tRNA synthetase, alpha2 dimer, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    In eubacteria, glycyl-tRNA synthetase is an alpha2/beta2 tetramer composed of 2 different subunits. In some eubacteria, in archaea and eukaryota, glycyl-tRNA synthetase is an alpha2 dimer (this family). It belongs to class IIc and is one of the most complex synthetases. What is most interesting is the lack of similarity between the two types: divergence at the sequence level is so great that it is impossible to infer descent from common genes. The alpha and beta subunits also lack significant sequence similarity. However, they are translated from a single mRNA, and a single chain glycyl-tRNA synthetase from Chlamydia trachomatis has been found to have significant similarity with both domains, suggesting divergence from a single polypeptide chain.

    The sequence and crystal structure of the homodimeric glycyl-tRNA synthetase from Thermus thermophilus show that each monomer consists of an active site strongly resembling that of the aspartyl and seryl enzymes, a C-terminal anticodon recognition domain of 100 residues and a third domain, unusually inserted between motifs 1 and 2, that almost certainly interacts with the acceptor arm of tRNA(Gly). The C-terminal domain has a novel five-stranded parallel-antiparallel beta-sheet structure with three surrounding helices. The active site residues most probably responsible for substrate recognition, in particular in the Gly binding pocket, can be identified by inference from aspartyl-tRNA synthetase due to the conserved nature of the class II active site.

    Proteins where this domain is known:
    PF14_0198   


    PTHR10746 - Ribos_L4_L1E_bac (Panther link)

    Interpro entry IPR013005 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family includes ribosomal L4/L1 from bacteria, chloroplasts and mitochondria. The L4 protein from yeast has been shown to bind rRNA.

    Proteins where this domain is known:
    PF08_0038   


    PTHR10746:SF1 - PTHR10746:SF1 (Panther link)

    Proteins where this domain is known:
    PF08_0038   


    PTHR10755 - Coprogen_oxidas (Panther link)

    Interpro entry IPR001260 : Coproporphyrinogen III oxidase (Interpro link)

    Interpro description:
    Coprogen oxidase (i.e. coproporphyrinogen III oxidase or coproporphyrinogenase) catalyses the oxidative decarboxylation of coproporphyrinogen III to protoporphyrinogen IX in the haem and chlorophyll biosynthetic pathways. The protein is a homodimer containing two internally bound iron atoms per molecule of native protein. The enzyme is active in the presence of molecular oxygen, which acts as an electron acceptor. The enzyme is widely distributed, having been found in a variety of eukaryotic and prokaryotic sources.

    Proteins where this domain is known:
    PF11_0436   


    PTHR10758 - PTHR10758 (Panther link)

    Proteins where this domain is known:
    MAL13P1.190   


    PTHR10759 - Ribosomal_L34E (Panther link)

    Interpro entry IPR008195 : Ribosomal protein L34e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins belong to the L34e family. These include vertebrate L34, mosquito L31, plant L34, yeast putative ribosomal protein YIL052c and archaebacterial L34e.

    Proteins where this domain is known:
    PF07_0043   


    PTHR10762 - Diphthamide_syn (Panther link)

    Interpro entry IPR002728 : (Interpro link)

    Interpro description:
    Members of this family include a candidate tumour suppressor gene, and DPH2 from yeast which confers resistance to diphtheria toxin and has been found to be involved in diphthamide synthesis. Diphtheria toxin inhibits eukaryotic protein synthesis by ADP-ribosylating diphthamide, a posttranslationally modified histidine residue present in EF2. The exact function of the members of this family is unknown.

    Proteins where this domain is known:
    PF14_0136    PF14_0274   


    PTHR10762:SF1 - PTHR10762:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0136   


    PTHR10762:SF2 - DPH2 (Panther link)

    Interpro entry IPR010014 : (Interpro link)

    Interpro description:

    This protein has been shown in Saccharomyces cerevisiae (Baker's yeast) to be one of several required for the modification of a particular histidine residue of translation elongation factor 2 to diphthamide. This modified site can then become the target for ADP-ribosylation by diphtheria toxin.

    Proteins where this domain is known:
    PF14_0274   


    PTHR10763 - ORIGIN OF REPLICATION BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    PF11_0304    PFE0155w    PFL0150w   


    PTHR10763:SF1 - PTHR10763:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0304   


    PTHR10763:SF6 - ORIGIN RECOGNITION COMPLEX SUBUNIT 1 (Panther link)

    Proteins where this domain is known:
    PFL0150w   


    PTHR10766 - EMP70 (Panther link)

    Interpro entry IPR004240 : Nonaspanin (TM9SF) (Interpro link)

    Interpro description:
    The transmembrane 9 superfamily protein (TM9SF) may function as a channel or small molecule transporter. Proteins in this group are endosomal integral membrane proteins.

    Proteins where this domain is known:
    PF10_0208   


    PTHR10766:SF8 - PTHR10766:SF8 (Panther link)

    Proteins where this domain is known:
    PF10_0208   


    PTHR10768 - Ribosomal_L37E (Panther link)

    Interpro entry IPR001569 : Ribosomal protein L37e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins of 56 to 96 amino-acid residues that share a highly conserved region located in the N-terminal part.

    Proteins where this domain is known:
    MAL7P1.320   


    PTHR10769 - Ribosomal_S28e (Panther link)

    Interpro entry IPR000289 : Ribosomal protein S28e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. Examples are:

  • Mammalian S28
  • Plant S28
  • Fungi S33
  • Archaebacterial S28e

    These proteins have from 64 to 78 amino acids and a highly conserved C-terminal extremity region.

    Proteins where this domain is known:
    PF14_0585   


    PTHR10769:SF1 - PTHR10769:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0585   


    PTHR10772 - Chaprnin_Cpn10 (Panther link)

    Interpro entry IPR001476 : Chaperonin Cpn10 (Interpro link)

    Interpro description:

    The chaperonins are 'helper' molecules required for correct folding and subsequent assembly of some proteins. These are required for normal cell growth, and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions. Type I chaperonins present in eubacteria, mitochondria and chloroplasts require the concerted action of 2 proteins, chaperonin 60 (cpn60) and chaperonin 10 (cpn10).

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between six and eight identical subunits, while the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The central cavity of the cylindrical cpn60 tetradecamer provides an isolated environment for protein folding whilst cpn10 binds to cpn60 and synchronises the release of the folded protein in an Mg2+-ATP dependent manner. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

    Escherichia coli GroES has also been shown to bind ATP cooperatively, and with an affinity comparable to that of GroEL. Each GroEL subunit contains three structurally distinct domains: an apical, an intermediate and an equatorial domain. The apical domain contains the binding sites for both GroES and the unfolded protein substrate. The equatorial domain contains the ATP-binding site and most of the oligomeric contacts. The intermediate domain links the apical and equatorial domains and transfers allosteric information between them. The GroEL oligomer is a tetradecamer, cylindrically shaped, that is organised in two heptameric rings stacked back to back. Each GroEL ring contains a central cavity, known as the 'Anfinsen cage', that provides an isolated environment for protein folding. The identical 10 kDa subunits of GroES form a dome-like heptameric oligomer in solution. ATP binding to GroES may be important in charging the seven subunits of the interacting GroEL ring with ATP, to facilitate cooperative ATP binding and hydrolysis for substrate protein release.

    Proteins where this domain is known:
    PF13_0180    PFL0740c   


    PTHR10773 - RNA_polK_14kDa (Panther link)

    Interpro entry IPR006111 : DNA-directed RNA polymerase, 14 to 18 kDa subunit (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise.

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses range from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    A component of 14 to 18 kDa is shared by all three forms of eukaryotic RNA polymerase. It has been sequenced in budding yeast (gene RPB6 or RPO26), in fission yeast (gene rpb6 or rpo15), in human and in African swine fever virus (ASFV), and is evolutionarily related to archaeal subunit K (gene rpoK). The archaeal protein is colinear with the C-terminal part of the eukaryotic subunit.

    Proteins where this domain is known:
    PFC0155c   


    PTHR10773:SF1 - PTHR10773:SF1 (Panther link)

    Proteins where this domain is known:
    PFC0155c   


    PTHR10778 - UDP-GALACTOSE TRANSPORTER-RELATED (Panther link)

    Proteins where this domain is known:
    PF11_0141   


    PTHR10778:SF10 - UDP-GALACTOSE TRANSPORTER B1 (SLC35B1) (Panther link)

    Proteins where this domain is known:
    PF11_0141   


    PTHR10779 - PTHR10779 (Panther link)

    Proteins where this domain is known:
    PF10_0195   


    PTHR10782 - PTHR10782 (Panther link)

    Proteins where this domain is known:
    MAL13P1.302   


    PTHR10783 - PTHR10783 (Panther link)

    Proteins where this domain is known:
    PFF0365c    PFF1075w    PFL1455w   


    PTHR10783:SF3 - PTHR10783:SF3 (Panther link)

    Proteins where this domain is known:
    PFL1455w   


    PTHR10783:SF4 - PTHR10783:SF4 (Panther link)

    Proteins where this domain is known:
    PFF0365c   


    PTHR10783:SF5 - PTHR10783:SF5 (Panther link)

    Proteins where this domain is known:
    PFF1075w   


    PTHR10784 - eIF6 (Panther link)

    Interpro entry IPR002769 : Translation initiation factor IF6 (Interpro link)

    Interpro description:

    This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.

    The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 causes a dose-dependent translational block and impairment of 80S formation, which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 thus provides a physical and functional link between PKC signalling and ribosome activation.

    Proteins where this domain is known:
    PF13_0178   


    PTHR10791 - MtN3_slv_TM (Panther link)

    Interpro entry IPR004316 : MtN3 and saliva related transmembrane protein (Interpro link)

    Interpro description:

    This family includes proteins such as Drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family.

    Proteins where this domain is known:
    PF07_0014    PF14_0347    PFB0760w    PFC0530w   


    PTHR10791:SF1 - UCP_MAL3P4.7 (Panther link)

    Interpro entry IPR018172 : (Interpro link)

    Interpro description:

    This entry represents a group of uncharacterised proteins that appear to be related to nodulin MtN3 and saliva related transmembrane protein.

    Proteins where this domain is known:
    PF14_0347    PFC0530w   


    PTHR10791:SF2 - UCP_PY00680 (Panther link)

    Interpro entry IPR018173 : (Interpro link)

    Interpro description:

    This entry represents a group of uncharacterised proteins that appear to be related to nodulin MtN3 and saliva related transmembrane protein.

    Proteins where this domain is known:
    PF07_0014   


    PTHR10791:SF6 - RAG1-activate_prot-1 (Panther link)

    Interpro entry IPR018179 : RAG1-activating protein 1 homologue (Interpro link)

    Interpro description:

    This entry represents the RAG1 (recombination activating gene 1)-activating protein 1 homologue. Expression of the recombination activating genes (RAG) involved in V(D)J recombination is regulated by the RAG1 gene activator (RGA) in mammals.

    Proteins where this domain is known:
    PFB0760w   


    PTHR10792 - Ribosomal_L24E (Panther link)

    Interpro entry IPR000988 : Ribosomal protein L24e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) or the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of mammalian ribosomal protein L24; yeast ribosomal protein L30A/B (Rp29) (YL21); Kluyveromyces lactis ribosomal protein L30; Arabidopsis thaliana ribosomal protein L24 homologue; Haloarcula marismortui ribosomal protein HL21/HL22; and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ1201. These proteins have 60 to 160 amino-acid residues.

    Proteins where this domain is known:
    PF13_0049    PFE0300c   


    PTHR10792:SF1 - PTHR10792:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0049   


    PTHR10792:SF2 - PTHR10792:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0300c   


    PTHR10795 - SubtilSerProt (Panther link)

    Interpro entry IPR015500 : (Interpro link)

    Interpro description:

    Limited proteolysis of most large protein precursors is carried out in vivo by the subtilisin-like pro-protein convertases. Many important biological processes such as peptide hormone synthesis, viral protein processing and receptor maturation involve proteolytic processing by these enzymes. The hormone and pro-protein convertases of the subtilisin-serine protease (SRSP) family (furin, PC1/3, PC2, PC4, PACE4, PC5/6, and PC7/8/LPC) act within the secretory pathway to cleave polypeptide precursors at specific basic sites, generating their biologically active forms. Serum proteins, pro-hormones, receptors, zymogens, viral surface glycoproteins, bacterial toxins and others are activated by this route. The SRSPs share the same domain structure, including a signal peptide, the pro-peptide, the catalytic domain, the P/middle or homo B domain, and the C-terminus.

    Proteins where this domain is known:
    PF11_0381    PFE0355c    PFE0370c   


    PTHR10795:SF26 - PTHR10795:SF26 (Panther link)

    Proteins where this domain is known:
    PFE0370c   


    PTHR10795:SF31 - PTHR10795:SF31 (Panther link)

    Proteins where this domain is known:
    PF11_0381   


    PTHR10795:SF32 - PTHR10795:SF32 (Panther link)

    Proteins where this domain is known:
    PFE0355c   


    PTHR10796 - PTHR10796 (Panther link)

    Proteins where this domain is known:
    PFA0375c   


    PTHR10796:SF33 - PTHR10796:SF33 (Panther link)

    Proteins where this domain is known:
    PFA0375c   


    PTHR10797 - CCR4-ASSOCIATED FACTOR (Panther link)

    Proteins where this domain is known:
    MAL8P1.104   


    PTHR10799 - PTHR10799 (Panther link)

    Proteins where this domain is known:
    MAL13P1.216    MAL8P1.65    PF08_0048    PF08_0126    PF10_0232    PF11_0053    PF13_0308    PFB0730w    PFB0745w    PFF0225w    PFF1185w    PFL2110c    PFL2440w   


    PTHR10799:SF43 - PTHR10799:SF43 (Panther link)

    Proteins where this domain is known:
    PF13_0308    PFF0225w   


    PTHR10799:SF62 - PTHR10799:SF62 (Panther link)

    Proteins where this domain is known:
    MAL13P1.216    PFL2440w   


    PTHR10799:SF66 - PTHR10799:SF66 (Panther link)

    Proteins where this domain is known:
    PF08_0126   


    PTHR10799:SF69 - PTHR10799:SF69 (Panther link)

    Proteins where this domain is known:
    MAL8P1.65    PFF1185w   


    PTHR10799:SF70 - PTHR10799:SF70 (Panther link)

    Proteins where this domain is known:
    PF10_0232   


    PTHR10799:SF73 - PTHR10799:SF73 (Panther link)

    Proteins where this domain is known:
    PF11_0053   


    PTHR10799:SF75 - PTHR10799:SF75 (Panther link)

    Proteins where this domain is known:
    PF08_0048    PFB0745w   


    PTHR10799:SF76 - PTHR10799:SF76 (Panther link)

    Proteins where this domain is known:
    PFB0730w   


    PTHR10802 - MITOCHONDRIAL IMPORT RECEPTOR SUBUNIT TOM40 (Panther link)

    Proteins where this domain is known:
    PFF0825c   


    PTHR10803 - PTHR10803 (Panther link)

    Proteins where this domain is known:
    PFD0725c   


    PTHR10804 - Peptidase_M24_cat_core (Panther link)

    Interpro entry IPR000994 : Peptidase M24, structural domain (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
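
    The two motif definitions above can be turned into a simple sequence scan. The following is a minimal sketch, not the method used by MEROPS or Interpro: the residue sets chosen for "uncharged" and "hydrophobic", the decision to exclude proline from both, and the function names are assumptions made for illustration.

```python
import re

# Residue classes for the stricter pattern. These sets are assumptions made
# for illustration; the text only says "uncharged" and "hydrophobic".
UNCHARGED = "ACFGHILMNQSTVWY"   # excludes D, E, K, R and (by assumption) P
HYDROPHOBIC = "AFILMVWY"

# Zero-width lookaheads so that overlapping matches are all reported.
HEXXH = re.compile(r"(?=HE..H)")
# 'abXHEbbHbc': 'a' is left unconstrained because the text says only
# "most often valine or threonine".
STRICT = re.compile(
    rf"(?=.[{UNCHARGED}].HE[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}])"
)


def find_hexxh(sequence):
    """Return 0-based start positions of every plain HEXXH match."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


def find_strict_motif(sequence):
    """Return 0-based start positions of every 'abXHEbbHbc' match.

    The first His of the motif sits three residues after the reported start.
    """
    return [m.start() for m in STRICT.finditer(sequence.upper())]
```

    A plain HEXXH hit that fails the stricter pattern (for example, one with a lysine at a 'b' position) is reported by find_hexxh but not by find_strict_motif, which is the narrowing the text describes.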

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.

    The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined at 2.1 Å resolution, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.

    The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.

    Proteins where this domain is known:
    MAL8P1.140    PF10_0150    PF14_0261    PF14_0327    PF14_0517    PFE1360c   


    PTHR10804:SF1 - PTHR10804:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0517   


    PTHR10804:SF11 - PTHR10804:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0261   


    PTHR10804:SF13 - Pept_M24A_MAP1 (Panther link)

    Interpro entry IPR002467 : Peptidase M24A, methionine aminopeptidase, subfamily 1 (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A.

    Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAPs studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. Although evolutionarily related, they share only limited sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2.
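
    The substrate rule stated above (the initiator methionine is removed when the next residue is small and uncharged) can be sketched as a one-line check. The text does not list which residues count as "small and uncharged"; the set used here (Gly, Ala, Ser, Thr, Cys, Pro, Val) is an assumption for illustration, and the function name is hypothetical.

```python
# Assumed set of "small and uncharged" residues; not taken from the entry.
SMALL_UNCHARGED = frozenset("GASTCPV")


def initiator_met_likely_removed(sequence):
    """Return True if the sequence starts with Met followed by a residue
    from the assumed small, uncharged set."""
    seq = sequence.upper()
    return len(seq) >= 2 and seq[0] == "M" and seq[1] in SMALL_UNCHARGED
```

    This is a rule of thumb only: it ignores enzyme subfamily, organism and any other sequence context.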

    Proteins where this domain is known:
    MAL8P1.140    PF10_0150    PFE1360c   


    PTHR10804:SF9 - Pept_M24A_MAP2 (Panther link)

    Interpro entry IPR002468 : Peptidase M24A, methionine aminopeptidase, subfamily 2 (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to MEROPS peptidase family M24 (clan MG), subfamily M24A.

    Methionine aminopeptidase (MAP) is responsible for the removal of the amino-terminal (initiator) methionine from nascent eukaryotic cytosolic and cytoplasmic prokaryotic proteins if the penultimate amino acid is small and uncharged. All MAPs studied to date are monomeric proteins that require cobalt ions for activity. Two subfamilies of MAP enzymes are known to exist. Although evolutionarily related, they share only limited sequence similarity, mostly clustered around the residues shown, in the Escherichia coli MAP, to be involved in cobalt binding. The first subfamily consists of enzymes from prokaryotes as well as eukaryotic MAP-1, while the second is made up of archaeal MAP and eukaryotic MAP-2 and includes proteins that do not seem to be MAPs but are clearly evolutionarily related, such as mouse proliferation-associated protein 1 and fission yeast curved DNA-binding protein.

    Proteins where this domain is known:
    PF14_0327   


    PTHR10805 - Coatomer_E (Panther link)

    Interpro entry IPR006822 : Coatomer, epsilon subunit (Interpro link)

    Interpro description:

    Proteins synthesised on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer. While clathrin mediates endocytic protein transport, and transport from ER to Golgi, coatomers primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. For example, the coatomer COPI (coat protein complex I) is responsible for reverse transport of recycled proteins from Golgi and pre-Golgi compartments back to the ER, while COPII buds vesicles from the ER to the Golgi. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Activated small guanosine triphosphatases (GTPases) attract coat proteins to specific membrane export sites, thereby linking coatomers to export cargos. As coat proteins polymerise, vesicles are formed and budded from membrane-bound organelles. Coatomer complexes also influence Golgi structural integrity, as well as the processing, activity, and endocytic recycling of LDL receptors. In mammals, coatomer complexes can only be recruited by membranes associated with ADP-ribosylation factors (ARFs), which are small GTP-binding proteins. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.

    This entry represents the epsilon subunit of the coatomer complex, which is involved in the regulation of intracellular protein trafficking between the endoplasmic reticulum and the Golgi complex.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    MAL8P1.121   


    PTHR10806 - Peptidase_S26B (Panther link)

    Interpro entry IPR001733 : Peptidase S26B, eukaryotic signal peptidase (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
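
    The relationship between the linear order of the triad residues and the clan can be written as a small lookup. This is only a sketch of the three examples given above, not a classifier: real clan assignment in MEROPS rests on structural fold and other evidence, and the function names are hypothetical.

```python
# Linear order of the catalytic residues -> clan, limited to the three
# examples named in the text.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(ser, asp, his):
    """Return the one-letter residue codes sorted by sequence position."""
    positions = {"S": ser, "D": asp, "H": his}
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(ser, asp, his):
    """Look up the clan for a triad order; None if the order is not listed."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(ser, asp, his))
```

    For instance, with the commonly quoted chymotrypsin numbering (His57, Asp102, Ser195) the order comes out as HDS, which maps to clan PA.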

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26B.

    Eukaryotic microsomal signal peptidase is involved in the removal of signal peptides from secretory proteins as they pass into the endoplasmic reticulum lumen. The peptidase is more complex than its mitochondrial and bacterial counterparts, containing a number of subunits, ranging from two in the chicken oviduct peptidase, to five in the dog pancreas protein. They share sequence similarity with the bacterial leader peptidases (family S26A), although activity here is mediated by a serine/histidine dyad rather than a serine/lysine dyad. Archaeal signal peptidases also belong to this group.

    Proteins where this domain is known:
    MAL13P1.167   


    PTHR10809 - PTHR10809 (Panther link)

    Proteins where this domain is known:
    PF14_0377   


    PTHR10811 - PTHR10811 (Panther link)

    Proteins where this domain is known:
    PFC0435w   


    PTHR10811:SF2 - PTHR10811:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0435w   


    PTHR10827 - RETICULOCALBIN (Panther link)

    Proteins where this domain is known:
    PF11_0098   


    PTHR10830 - DDOST_48kDa (Panther link)

    Interpro entry IPR005013 : Dolichyl-diphosphooligosaccharide-protein glycosyltransferase 48kDa subunit (Interpro link)

    Interpro description:

    Members of this family are involved in asparagine-linked protein glycosylation. In particular, dolichyl-diphosphooligosaccharide-protein glycosyltransferase (DDOST), also known as oligosaccharyltransferase, transfers the high-mannose sugar GlcNAc(2)-Man(9)-Glc(3) from a dolichol-linked donor to an asparagine acceptor in a consensus Asn-X-Ser/Thr motif. In most eukaryotes, the DDOST complex is composed of three subunits, which in humans are described as a 48kDa subunit, ribophorin I, and ribophorin II. However, the yeast DDOST appears to consist of six subunits (alpha, beta, gamma, delta, epsilon, zeta). The yeast beta subunit is a 45kDa polypeptide, previously discovered as the Wbp1 protein, with known sequence similarity to the human 48kDa subunit and the other orthologues. This family includes the 48kDa-like subunits from several eukaryotes; it also includes the yeast DDOST beta subunit Wbp1.

    Proteins where this domain is known:
    PFI0960w   


    PTHR10836 - GAP_DH (Panther link)

    Interpro entry IPR000173 : Glyceraldehyde 3-phosphate dehydrogenase (Interpro link)

    Interpro description:

    Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) plays an important role in glycolysis and gluconeogenesis by reversibly catalysing the oxidation and phosphorylation of D-glyceraldehyde-3-phosphate to 1,3-diphospho-glycerate. The enzyme exists as a tetramer of identical subunits, each containing 2 conserved functional domains: an NAD-binding domain, and a highly conserved catalytic domain. The enzyme has been found to bind to actin and tropomyosin, and may thus have a role in cytoskeleton assembly. Alternatively, the cytoskeleton may provide a framework for precise positioning of the glycolytic enzymes, thus permitting efficient passage of metabolites from enzyme to enzyme.

    GAPDH displays diverse non-glycolytic functions as well, its role depending upon its subcellular location. For instance, the translocation of GAPDH to the nucleus acts as a signalling mechanism for programmed cell death, or apoptosis. The accumulation of GAPDH within the nucleus is involved in the induction of apoptosis, where GAPDH functions in the activation of transcription. The presence of GAPDH is associated with the synthesis of pro-apoptotic proteins like BAX, c-JUN and GAPDH itself.

    GAPDH has been implicated in certain neurological diseases: GAPDH is able to bind to the gene products from neurodegenerative disorders such as Huntington's disease, Alzheimer's disease, Parkinson's disease and Machado-Joseph disease through stretches encoded by their CAG repeats. Abnormal neuronal apoptosis is associated with these diseases. Propargylamines such as deprenyl increase neuronal survival by interfering with apoptosis signalling pathways via their binding to GAPDH, which decreases the synthesis of pro-apoptotic proteins.

    Proteins where this domain is known:
    PF14_0598   


    PTHR10840 - TFAR19_DNA_bd (Panther link)

    Interpro entry IPR002836 : DNA-binding TFAR19-related protein (Interpro link)

    Interpro description:

    This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein that shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.

    Proteins where this domain is known:
    PFI0450c   


    PTHR10848 - Spo11/TopoVI_A (Panther link)

    Interpro entry IPR002815 : Spo11/DNA topoisomerase VI, subunit A (Interpro link)

    Interpro description:

    This entry represents Spo11, a meiotic recombination protein found in eukaryotes, and subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea. These two types of proteins share structural homology.

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.

    Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF10_0412    PFL0825c   


    PTHR10853 - PelA (Panther link)

    Interpro entry IPR004405 : Probable translation factor pelota (Interpro link)

    Interpro description:

    The Drosophila melanogaster protein pelota is proposed to act in protein translation. It can replace the budding yeast protein DOM34, and is closely related to a set of archaeal proteins. This family contains a proposed RNA binding motif, and is homologous to a family of peptide chain release factors. In Drosophila melanogaster it is required prior to the first meiotic division for spindle formation and nuclear envelope breakdown during spermatogenesis. It is also required for normal eye patterning and for mitotic divisions in the ovary. The meiotic defect in pelota mutants may be a complex result of a protein translation defect, as suggested in yeast by ribosomal protein RPS30A being a multicopy suppressor, and by an altered polyribosome profile in DOM34 mutants rescued by RPS30A.

    Proteins where this domain is known:
    MAL7P1.118   


    PTHR10855 - PTHR10855 (Panther link)

    Proteins where this domain is known:
    PF10_0174   


    PTHR10855:SF1 - PTHR10855:SF1 (Panther link)

    Proteins where this domain is known:
    PF10_0174   


    PTHR10856 - Coronin (Panther link)

    Interpro entry IPR015505 : (Interpro link)

    Interpro description:

    Coronin is an actin-binding protein that belongs to the WD40-repeat family proteins and contains 5 WD40 repeats. The WD40 motif is found in a multitude of eukaryotic proteins involved in a variety of cellular processes. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The final 40 amino acids are predicted to form a coiled-coil in a coronin homodimer.

    Coronin has been found to localise to actin-rich regions of the cell and binds to F-actin with a Kd of 1-5 nM in yeast. It has also been shown to have actin bundling and nucleation activity, and can bind to microtubules in vivo. Coronin has also been shown to bind to and inhibit the Arp2/3 complex in yeast.

    Proteins where this domain is known:
    PFL2460w   


    PTHR10859 - PTHR10859 (Panther link)

    Proteins where this domain is known:
    PF11_0427   


    PTHR10869 - PTHR10869 (Panther link)

    Proteins where this domain is known:
    MAL8P1.8   


    PTHR10869:SF1 - PTHR10869:SF1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.8   


    PTHR10871 - PTHR10871 (Panther link)

    Proteins where this domain is known:
    PF11_0272   


    PTHR10871:SF2 - PTHR10871:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0272   


    PTHR10876 - PTHR10876 (Panther link)

    Proteins where this domain is known:
    PF13_0313   


    PTHR10882 - Dphthn_synthase (Panther link)

    Interpro entry IPR004551 : Diphthine synthase (Interpro link)

    Interpro description:

    Diphthine synthase, also known as diphthamide biosynthesis S-adenosylmethionine-dependent methyltransferase, participates in the modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. It is required for the methylation step in diphthamide biosynthesis. The protein was characterised in Saccharomyces cerevisiae and designated DPH5.

    Proteins where this domain is known:
    PF10_0087   


    PTHR10887 - PTHR10887 (Panther link)

    Proteins where this domain is known:
    MAL13P1.13    MAL7P1.12    PF10_0057    PF10_0099    PF11_0078    PF13_0187    PF13_0273    PFL1650w   


    PTHR10887:SF14 - PTHR10887:SF14 (Panther link)

    Proteins where this domain is known:
    PF11_0078   


    PTHR10887:SF18 - PTHR10887:SF18 (Panther link)

    Proteins where this domain is known:
    PF13_0187   


    PTHR10887:SF26 - PTHR10887:SF26 (Panther link)

    Proteins where this domain is known:
    MAL7P1.12    PF10_0057   


    PTHR10887:SF27 - PTHR10887:SF27 (Panther link)

    Proteins where this domain is known:
    MAL13P1.13    PF10_0099   


    PTHR10887:SF5 - PTHR10887:SF5 (Panther link)

    Proteins where this domain is known:
    PF13_0273   
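
    The PTHR10887 family and subfamily listings above can be cross-checked against each other. The following is a minimal sketch, assuming the protein identifiers are copied by hand from those entries; the names FAMILY, SUBFAMILIES and unassigned are illustrative and not part of any database API.

```python
# Proteins listed under the PTHR10887 family entry (copied from the listing).
FAMILY = {
    "MAL13P1.13", "MAL7P1.12", "PF10_0057", "PF10_0099",
    "PF11_0078", "PF13_0187", "PF13_0273", "PFL1650w",
}

# Proteins listed under each PTHR10887 subfamily entry.
SUBFAMILIES = {
    "SF14": {"PF11_0078"},
    "SF18": {"PF13_0187"},
    "SF26": {"MAL7P1.12", "PF10_0057"},
    "SF27": {"MAL13P1.13", "PF10_0099"},
    "SF5": {"PF13_0273"},
}


def unassigned(family, subfamilies):
    """Return family members that appear in no subfamily, sorted by name."""
    assigned = set().union(*subfamilies.values())
    return sorted(family - assigned)


def stray(family, subfamilies):
    """Return subfamily members that are missing from the family list."""
    assigned = set().union(*subfamilies.values())
    return sorted(assigned - family)


if __name__ == "__main__":
    print("No subfamily:", unassigned(FAMILY, SUBFAMILIES))
    print("Not in family:", stray(FAMILY, SUBFAMILIES))
```

    With the identifiers as listed, one family member carries no subfamily assignment, and every subfamily member also appears in the family list.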


    PTHR10889 - DeoC (Panther link)

    Interpro entry IPR011343 : Deoxyribose-phosphate aldolase (Interpro link)

    Interpro description:

    Class I aldolases catalyse carbon-carbon bond formation using a 'Schiff base' mechanism. This entry represents deoxyribose-phosphate aldolase, a widely distributed enzyme, which catalyses the following reversible reaction:

     2-deoxy-D-ribose 5-phosphate = D-glyceraldehyde 3-phosphate + acetaldehyde

    While the physiological role of this enzyme remains unknown in eukaryotes, in prokaryotes it is thought to function in the catabolism of deoxyribonucleotides.
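
    The reaction given above can be checked for atom balance. This is a minimal sketch, assuming the molecular formulas of the neutral species (C5H11O7P, C3H7O6P and C2H4O), which are not stated in the description; FORMULAS, atom_total and is_balanced are illustrative names.

```python
from collections import Counter

# Molecular formulas of the neutral species (an assumption of this sketch;
# the description above gives only the compound names).
FORMULAS = {
    "2-deoxy-D-ribose 5-phosphate": {"C": 5, "H": 11, "O": 7, "P": 1},
    "D-glyceraldehyde 3-phosphate": {"C": 3, "H": 7, "O": 6, "P": 1},
    "acetaldehyde": {"C": 2, "H": 4, "O": 1},
}


def atom_total(species):
    """Sum the atom counts over a list of compound names."""
    total = Counter()
    for name in species:
        total.update(FORMULAS[name])  # adds counts element by element
    return total


def is_balanced(substrates, products):
    """True when both sides of the reaction contain the same atoms."""
    return atom_total(substrates) == atom_total(products)


if __name__ == "__main__":
    left = ["2-deoxy-D-ribose 5-phosphate"]
    right = ["D-glyceraldehyde 3-phosphate", "acetaldehyde"]
    print(is_balanced(left, right))
```

    Because the cleavage is a retro-aldol reaction with no other substrate or product, both sides should carry the same atoms under these formulas.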

    In all studied structures, the deoxyribose-phosphate aldolase subunits adopt the classical eight-bladed TIM barrel fold. The oligomerisation state of the enzyme appears to depend on the living temperature of the organism - the Escherichia coli enzyme is a homodimer, while the enzymes from the thermophilic microorganisms Thermus thermophilus and Aeropyrum pernix are homotetramers. The degree of oligomerisation does not, however, appear to affect catalysis.

    Proteins where this domain is known:
    PF10_0210   


    PTHR10890 - Cys_tRNA-synt_1a (Panther link)

    Interpro entry IPR015804 : Cysteinyl-tRNA synthetase, class Ia, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Cysteinyl-tRNA synthetase is an alpha monomer and belongs to class Ia.
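
    The class assignment described above can be written as a small lookup. This is a minimal sketch that encodes only the two amino acid lists given in the description; the names CLASS_I, CLASS_II and synthetase_class are illustrative, and the a/b/c subclasses are not modelled because the text does not list their members.

```python
# Amino acids whose synthetases the description assigns to class I.
CLASS_I = frozenset({
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
})

# Amino acids whose synthetases the description assigns to class II.
CLASS_II = frozenset({
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
})


def synthetase_class(amino_acid):
    """Return "I" or "II" for the synthetase specific to this amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")


if __name__ == "__main__":
    for aa in ("Cysteine", "Serine"):
        print(aa, "-> class", synthetase_class(aa))
```

    The two sets are disjoint and together cover the 20 amino acids named in the description, so every valid input maps to exactly one class.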

    Proteins where this domain is known:
    PF10_0149   


    PTHR10891 - PTHR10891 (Panther link)

    Proteins where this domain is known:
    MAL7P1.10    PF10_0271    PF10_0301    PF11_0066    PF11_0389    PF14_0420    PF14_0443    PFL2225w   


    PTHR10891:SF2 - PTHR10891:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0066   


    PTHR10891:SF41 - PTHR10891:SF41 (Panther link)

    Proteins where this domain is known:
    MAL7P1.10    PF10_0271    PF14_0443   


    PTHR10891:SF44 - PTHR10891:SF44 (Panther link)

    Proteins where this domain is known:
    PF10_0301    PFL2225w   


    PTHR10891:SF7 - PTHR10891:SF7 (Panther link)

    Proteins where this domain is known:
    PF14_0420   


    PTHR10891:SF8 - PTHR10891:SF8 (Panther link)

    Proteins where this domain is known:
    PF11_0389   


    PTHR10894 - PTHR10894 (Panther link)

    Proteins where this domain is known:
    PF10_0085    PF11_0191   


    PTHR10898 - PTHR10898 (Panther link)

    Proteins where this domain is known:
    PF13_0358   


    PTHR10898:SF10 - PTHR10898:SF10 (Panther link)

    Proteins where this domain is known:
    PF13_0358   


    PTHR10902 - Ribosomal_L35AE (Panther link)

    Interpro entry IPR001780 : Ribosomal protein L35Ae (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:

    These proteins have 87 to 110 amino-acid residues.

    Proteins where this domain is known:
    PF11_0438   


    PTHR10906 - SecY (Panther link)

    Interpro entry IPR002208 : SecY protein (Interpro link)

    Interpro description:

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.

    The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.

    The eubacterial secY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.

    Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.

    Proteins where this domain is known:
    MAL13P1.231   


    PTHR10906:SF1 - PREPROTEIN TRANSLOCASE SECY SUBUNIT (SEC61) (Panther link)

    Proteins where this domain is known:
    MAL13P1.231   


    PTHR10917 - RNA_pol_Rpb8 (Panther link)

    Interpro entry IPR005570 : RNA polymerase, Rpb8 (Interpro link)

    Interpro description:
    Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits.

    Proteins where this domain is known:
    PFL0665c   


    PTHR10920 - rRNA-MeTfrase_J (Panther link)

    Interpro entry IPR015507 : Ribosomal RNA methyltransferase J (Interpro link)

    Interpro description:

    The ribosomal RNA large subunit methyltransferase J methylates the 23S rRNA. It specifically methylates the uridine in position 2552 of 23S rRNA in the 50S particle using S-adenosyl-L-methionine as a substrate. It was previously known as cell division protein ftsJ.

    Proteins where this domain is known:
    PF13_0052    PF13_0286    PFI0415c   


    PTHR10925 - PTHR10925 (Panther link)

    Proteins where this domain is known:
    PF10_0200   


    PTHR10925:SF5 - PTHR10925:SF5 (Panther link)

    Proteins where this domain is known:
    PF10_0200   


    PTHR10926 - DUF284_TM_euk (Panther link)

    Interpro entry IPR005045 : Protein of unknown function DUF284, transmembrane eukaryotic (Interpro link)

    Interpro description:
    Members of this family have no known function. They have predicted transmembrane helices.

    Proteins where this domain is known:
    PF07_0078    PF10_0287    PF11_0343   


    PTHR10927 - SBDS (Panther link)

    Interpro entry IPR002140 : (Interpro link)

    Interpro description:
    The proteins in this entry are highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterised by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism. In yeast these proteins have been shown to be critical for the release and recycling of the nucleolar shuttling factor Tif6 from pre-60S ribosomes, a key step in 60S maturation and translational activation of ribosomes. This data links defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition.

    A number of uncharacterised hydrophilic proteins of about 30 kDa share regions of similarity. These include,

    Proteins where this domain is known:
    PF14_0107   


    PTHR10933 - TAP42 (Panther link)

    Interpro entry IPR007304 : TAP42-like protein (Interpro link)

    Interpro description:
    The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 interacts with TAP42 and negatively regulates the TOR signalling pathway.

    Proteins where this domain is known:
    PF07_0102   


    PTHR10933:SF1 - PTHR10933:SF1 (Panther link)

    Proteins where this domain is known:
    PF07_0102   


    PTHR10934 - Ribosomal_L18e (Panther link)

    Interpro entry IPR000039 : Ribosomal protein L18e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Members of this family are large subunit ribosomal proteins which are found in the Eukaryota and Archaea. These proteins have 115 to 187 amino-acid residues. The family consists of:

    Proteins where this domain is known:
    MAL13P1.209   


    PTHR10937 - PTHR10937 (Panther link)

    Proteins where this domain is known:
    PF10_0245   


    PTHR10938 - IF3 (Panther link)

    Interpro entry IPR001288 : Initiation factor 3 (Interpro link)

    Interpro description:

    Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic protein that binds to the 30S ribosomal subunit. The chloroplast initiation factor IF-3(chl) is a protein that enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal 30S subunits; its central section is evolutionarily related to the sequence of bacterial IF-3.

    Proteins where this domain is known:
    MAL8P1.27   


    PTHR10943 - PTHR10943 (Panther link)

    Proteins where this domain is known:
    PF14_0632    PFB0260w   


    PTHR10943:SF1 - PTHR10943:SF1 (Panther link)

    Proteins where this domain is known:
    PFB0260w   


    PTHR10943:SF2 - PTHR10943:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0632   


    PTHR10947 - PTHR10947 (Panther link)

    Proteins where this domain is known:
    PF11_0051   


    PTHR10949 - Lipoate_synth (Panther link)

    Interpro entry IPR003698 : Lipoate synthase (Interpro link)

    Interpro description:
    Lipoic acid is a covalently bound disulphide-containing cofactor required for function of the pyruvate dehydrogenase, alpha-ketoglutarate dehydrogenase, and glycine cleavage enzyme complexes of Escherichia coli. Two genes, lipA and lipB, are involved in lipoic acid biosynthesis or metabolism. LipA is required for the insertion of the first sulphur into the octanoic acid backbone. LipB functions downstream of LipA, but its role in lipoic acid metabolism remains unclear. Lipoate synthase (or lipoic acid synthetase) catalyses the formation of alpha-(+)-lipoic acid, required for lipoate biosynthesis.

    Proteins where this domain is known:
    MAL13P1.220   


    PTHR10953 - UBIQUITIN-ACTIVATING ENZYME E1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.75    PF11_0271    PF11_0457    PF13_0182    PF13_0264    PF13_0344    PFL1245w    PFL1790w   


    PTHR10953:SF3 - PTHR10953:SF3 (Panther link)

    Proteins where this domain is known:
    PF11_0271   


    PTHR10953:SF30 - PTHR10953:SF30 (Panther link)

    Proteins where this domain is known:
    PF11_0457    PF13_0182    PF13_0264   


    PTHR10953:SF4 - PTHR10953:SF4 (Panther link)

    Proteins where this domain is known:
    PFL1245w   


    PTHR10953:SF5 - PTHR10953:SF5 (Panther link)

    Proteins where this domain is known:
    PFL1790w   


    PTHR10953:SF6 - UBIQUITIN-ACTIVATING ENZYME E1C (Panther link)

    Proteins where this domain is known:
    MAL8P1.75   


    PTHR10953:SF8 - PTHR10953:SF8 (Panther link)

    Proteins where this domain is known:
    PF13_0344   


    PTHR10954 - RNase_HII/HIII (Panther link)

    Interpro entry IPR001352 : Ribonuclease HII/HIII (Interpro link)

    Interpro description:

    Ribonuclease HII is involved in the degradation of the ribonucleotide moiety on RNA-DNA hybrid molecules, carrying out endonucleolytic cleavage to 5'-phosphomonoester. Proteins which belong to this family have been found in bacteria, archaea, and yeasts. This family also includes Ribonuclease HIII.

    Proteins where this domain is known:
    PFF1150w   


    PTHR10956 - Ribosomal_L31e (Panther link)

    Interpro entry IPR000054 : Ribosomal protein L31e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:

  • Yeast L34
  • Archaeal L31
  • Plants L31
  • Mammalian L31

    Proteins where this domain is known:
    PFE0185c   


    PTHR10965 - Ribosomal_L38e (Panther link)

    Interpro entry IPR002675 : Ribosomal protein L38e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L38e forms part of the 60S ribosomal subunit. This family is found in eukaryotes.

    Proteins where this domain is known:
    PF11_0312   


    PTHR10967 - PTHR10967 (Panther link)

    Proteins where this domain is known:
    MAL7P1.113    MAL7P1.201    MAL8P1.19    PF08_0096    PF08_0111    PF10_0209    PF10_0309    PF11_0077    PF11_0254    PF13_0037    PF13_0077    PF13_0177    PF14_0183    PF14_0185    PF14_0429    PF14_0436    PF14_0437    PF14_0563    PF14_0655    PFA0180w    PFB0445c    PFB0860c    PFC0915w    PFC0955w    PFD0245c    PFD0565c    PFD1070w    PFE0205w    PFE0215w    PFE0430w    PFE0925c    PFE1085w    PFE1390w    PFF1500c    PFL0100c    PFL1310c    PFL2010c    PFL2475w   


    PTHR10967:SF11 - DEAD-BOX PROTEIN 47 (Panther link)

    Proteins where this domain is known:
    PFB0860c   


    PTHR10967:SF14 - gb def: Putative ATP-dependent RNA helicase (Panther link)

    Proteins where this domain is known:
    PF13_0177   


    PTHR10967:SF19 - ATP-DEPENDENT RNA HELICASE DDX27 (DEAD-BOX PROTEIN 27) (Panther link)

    Proteins where this domain is known:
    PFL2475w   


    PTHR10967:SF2 - EUKARYOTIC INITIATION FACTOR 4A (Panther link)

    Proteins where this domain is known:
    PF14_0655    PFD1070w   


    PTHR10967:SF20 - PTHR10967:SF20 (Panther link)

    Proteins where this domain is known:
    PF14_0429   


    PTHR10967:SF21 - PTHR10967:SF21 (Panther link)

    Proteins where this domain is known:
    PF10_0209   


    PTHR10967:SF22 - DEAD-BOX PROTEIN 21, 50 (Panther link)

    Proteins where this domain is known:
    PFE0215w   


    PTHR10967:SF23 - PTHR10967:SF23 (Panther link)

    Proteins where this domain is known:
    PFF1500c   


    PTHR10967:SF25 - ATP-DEPENDENT RNA HELICASE DBP7 (DEAD-BOX PROTEIN 7) (Panther link)

    Proteins where this domain is known:
    MAL7P1.113   


    PTHR10967:SF27 - gb def: RNA helicase, putative (Panther link)

    Proteins where this domain is known:
    PF14_0183   


    PTHR10967:SF28 - PTHR10967:SF28 (Panther link)

    Proteins where this domain is known:
    PF13_0037   


    PTHR10967:SF29 - PTHR10967:SF29 (Panther link)

    Proteins where this domain is known:
    PFL2010c   


    PTHR10967:SF30 - PTHR10967:SF30 (Panther link)

    Proteins where this domain is known:
    MAL8P1.19   


    PTHR10967:SF31 - PTHR10967:SF31 (Panther link)

    Proteins where this domain is known:
    MAL7P1.201    PFD0245c    PFL0100c   


    PTHR10967:SF33 - PTHR10967:SF33 (Panther link)

    Proteins where this domain is known:
    PFC0955w   


    PTHR10967:SF37 - gb def: DEAD/DEAH box helicase, putative (Fragment) (Panther link)

    Proteins where this domain is known:
    PF14_0185   


    PTHR10967:SF4 - PTHR10967:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0563   


    PTHR10967:SF42 - PTHR10967:SF42 (Panther link)

    Proteins where this domain is known:
    PF11_0254   


    PTHR10967:SF44 - ATP-DEPENDENT HELICASE DDX1 (DEAD-BOX PROTEIN 1) (Panther link)

    Proteins where this domain is known:
    PF11_0077    PFE1085w   


    PTHR10967:SF45 - PTHR10967:SF45 (Panther link)

    Proteins where this domain is known:
    PFE0430w   


    PTHR10967:SF46 - PTHR10967:SF46 (Panther link)

    Proteins where this domain is known:
    PFE1390w   


    PTHR10967:SF47 - PTHR10967:SF47 (Panther link)

    Proteins where this domain is known:
    PFE0925c   


    PTHR10967:SF5 - DEAD (ASP-GLU-ALA-ASP) BOX POLYPEPTIDE 39 AND P47 (Panther link)

    Proteins where this domain is known:
    PFB0445c   


    PTHR10967:SF50 - PTHR10967:SF50 (Panther link)

    Proteins where this domain is known:
    PF14_0436    PF14_0437    PFL1310c   


    PTHR10967:SF6 - ATP-DEPENDENT RNA HELICASE P54 (Panther link)

    Proteins where this domain is known:
    PFC0915w   


    PTHR10967:SF60 - gb def: DEAD box polypeptide, Y chromosome-related (Panther link)

    Proteins where this domain is known:
    PF08_0096   


    PTHR10967:SF9 - PTHR10967:SF9 (Panther link)

    Proteins where this domain is known:
    PF08_0111   


    PTHR10969 - MAP1_LC3 (Panther link)

    Interpro entry IPR004241 : (Interpro link)

    Interpro description:
    Light chain 3 (LC3) may function primarily as a MAP1A and MAP1B subunit and its expression may regulate the microtubule binding activity of the neuronal microtubule-associated proteins (MAPs), MAP1A and MAP1B. Related proteins that belong to this group include the human ganglioside expression factor and a symbiosis-related fungal protein.

    Proteins where this domain is known:
    PF10_0193   


    PTHR10969:SF2 - PTHR10969:SF2 (Panther link)

    Proteins where this domain is known:
    PF10_0193   


    PTHR10971 - PTHR10971 (Panther link)

    Proteins where this domain is known:
    PF13_0250   


    PTHR10971:SF2 - PTHR10971:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0250   


    PTHR10972 - Oxysterol_bd (Panther link)

    Interpro entry IPR000648 : Oxysterol-binding protein (Interpro link)

    Interpro description:
    A number of eukaryotic proteins that seem to be involved with sterol synthesis and/or its regulation have been found to be evolutionarily related. These include mammalian oxysterol-binding protein (OSBP), a protein of about 800 amino-acid residues that binds a variety of oxysterols (oxygenated derivatives of cholesterol); yeast OSH1, a protein of 859 residues that also plays a role in ergosterol synthesis; yeast proteins HES1 and KES1, highly related proteins of 434 residues that seem to play a role in ergosterol synthesis; and yeast hypothetical proteins YHR001w, YHR073w and YKR003w.

    Proteins where this domain is known:
    PF11_0327   


    PTHR10972:SF10 - PTHR10972:SF10 (Panther link)

    Proteins where this domain is known:
    PF11_0327   


    PTHR10982 - MALONYL COENZYME A-ACYL CARRIER PROTEIN TRANSACYLASE-RELATED (Panther link)

    Proteins where this domain is known:
    PF13_0066   


    PTHR10982:SF4 - MALONYL COA-ACYL CARRIER PROTEIN TRANSACYLASE (Panther link)

    Proteins where this domain is known:
    PF13_0066   


    PTHR10986 - Ribosomal_L20 (Panther link)

    Interpro entry IPR005813 : Ribosomal protein L20 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid the elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L20 is a protein from the large (50S) subunit; in Escherichia coli it is known to bind directly to the 23S rRNA, and is required for ribosome assembly, but does not take part in protein synthesis. It belongs to a family of ribosomal proteins that includes L20 from eubacteria, plant and algal chloroplasts, and cyanelles.

    Proteins where this domain is known:
    PF14_0709   


    PTHR10992 - PTHR10992 (Panther link)

    Proteins where this domain is known:
    PF08_0022    PF11_0441    PF14_0015    PF14_0099    PFC0065c   


    PTHR10992:SF17 - PTHR10992:SF17 (Panther link)

    Proteins where this domain is known:
    PF08_0022    PF11_0441    PF14_0015    PFC0065c   


    PTHR10993 - PTHR10993 (Panther link)

    Proteins where this domain is known:
    MAL8P1.37   


    PTHR10997 - IMPORTIN (RAN-BINDING PROTEIN) (Panther link)

    Proteins where this domain is known:
    MAL7P1.202    PF14_0304    PF14_0549    PFI1590c   


    PTHR10997:SF1 - PTHR10997:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1590c   


    PTHR10997:SF18 - IMPORTIN 7, 8 (IMP7, 8) (RAN-BINDING PROTEIN 7, 8) (Panther link)

    Proteins where this domain is known:
    MAL7P1.202   


    PTHR10997:SF2 - PTHR10997:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0304   


    PTHR10997:SF3 - PTHR10997:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0549   


    PTHR11005 - PTHR11005 (Panther link)

    Proteins where this domain is known:
    PF11_0276   


    PTHR11006 - PTHR11006 (Panther link)

    Proteins where this domain is known:
    PF08_0092    PF14_0242   


    PTHR11006:SF18 - PTHR11006:SF18 (Panther link)

    Proteins where this domain is known:
    PF14_0242   


    PTHR11006:SF8 - PTHR11006:SF8 (Panther link)

    Proteins where this domain is known:
    PF08_0092   


    PTHR11009 - DER1 (Panther link)

    Interpro entry IPR007599 : (Interpro link)

    Interpro description:

    The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae (Baker's yeast) contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process, and the classes were called der for degradation in the ER. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localized to the ER. Deletion of DER1 abolished degradation of the substrate proteins, suggesting that the function of the Der1 protein may be specifically required for the degradation process associated with the ER. Interestingly this family seems distantly related to the Rhomboid family of membrane peptidases. This family may also mediate degradation of misfolded proteins.

    Proteins where this domain is known:
    PF10_0317    PF14_0498    PF14_0653   


    PTHR11019 - PTHR11019 (Panther link)

    Proteins where this domain is known:
    PFF1335c   


    PTHR11019:SF4 - PTHR11019:SF4 (Panther link)

    Proteins where this domain is known:
    PFF1335c   


    PTHR11021 - PTHR11021 (Panther link)

    Proteins where this domain is known:
    PF11_0280    PF13_0142   


    PTHR11024 - PTHR11024 (Panther link)

    Proteins where this domain is known:
    PFL1480w   


    PTHR11024:SF2 - PTHR11024:SF2 (Panther link)

    Proteins where this domain is known:
    PFL1480w   


    PTHR11028 - PTHR11028 (Panther link)

    Proteins where this domain is known:
    PF14_0615   


    PTHR11035 - PTPLA (Panther link)

    Interpro entry IPR007482 : (Interpro link)

    Interpro description:

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:

    Based on their cellular localisation, PTPases are also classified as:

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.

    This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types.
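
    The C(X)5R signature described above (a cysteine, any five residues, then an arginine) can be searched for with a regular expression. The following is a minimal sketch, not part of the Interpro entry: the function name find_cx5 and the example sequences are made up for illustration, and the optional last argument is an assumption added so that the proline-for-arginine variant reported for PTPLA can be searched the same way.

```python
import re

def find_cx5(sequence: str, last: str = "R") -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every C(X)5<last> occurrence.

    A lookahead is used so that overlapping occurrences are all reported.
    `last` is the residue that closes the motif: "R" for the canonical
    PTP signature, "P" for the PTPLA-like variant (illustrative option).
    """
    pattern = re.compile(f"(?=(C[A-Z]{{5}}{re.escape(last)}))")
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]

# Toy sequences, invented for this example only.
canonical = "MKVHCSAGVGRSGT"
variant = "MKVHCSAGVGPSGT"

print(find_cx5(canonical))          # canonical C(X)5R search
print(find_cx5(variant, last="P"))  # proline-ending variant search
```

    A match only shows that the spacing pattern is present; it says nothing about whether the protein is catalytically active.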

    Proteins where this domain is known:
    MAL13P1.168   


    PTHR11035:SF1 - PTHR11035:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.168   


    PTHR11038 - PTHR11038 (Panther link)

    Proteins where this domain is known:
    PFE0140c    PFL0430w   


    PTHR11040 - PTHR11040 (Panther link)

    Proteins where this domain is known:
    PF10_0216    PFF0450c   


    PTHR11042 - PTHR11042 (Panther link)

    Proteins where this domain is known:
    PF14_0423    PFA0380w    PFF1370w   


    PTHR11042:SF3 - PTHR11042:SF3 (Panther link)

    Proteins where this domain is known:
    PFA0380w   


    PTHR11042:SF7 - PTHR11042:SF7 (Panther link)

    Proteins where this domain is known:
    PF14_0423   


    PTHR11043 - PTHR11043 (Panther link)

    Proteins where this domain is known:
    PFD0745c   


    PTHR11048 - PTHR11048 (Panther link)

    Proteins where this domain is known:
    PFE0970w    PFF0370w   


    PTHR11048:SF2 - PTHR11048:SF2 (Panther link)

    Proteins where this domain is known:
    PFF0370w   


    PTHR11048:SF3 - CyoE_CtaB (Panther link)

    Interpro entry IPR006369 : Protohaem IX farnesyltransferase (Interpro link)

    Interpro description:

    This family describes protohaem IX farnesyltransferase, also called haem O synthase, an enzyme that creates an intermediate in the biosynthesis of haem A. Prior to the description of its enzymatic function, this protein was often called a cytochrome o ubiquinol oxidase assembly factor.

    Proteins where this domain is known:
    PFE0970w   


    PTHR11060 - DUF52 (Panther link)

    Interpro entry IPR002737 : (Interpro link)

    Interpro description:

    This entry contains proteins from all branches of life. The molecular function of these proteins is unknown, but Memo (mediator of ErbB2-driven cell motility), a human protein, is included in this family. It has been suggested that Memo controls cell migration by relaying extracellular chemotactic signals to the microtubule cytoskeleton.

    Proteins where this domain is known:
    PFD0850c   


    PTHR11061 - PTHR11061 (Panther link)

    Proteins where this domain is known:
    MAL13P1.31    PF11_0348   


    PTHR11061:SF2 - PTHR11061:SF2 (Panther link)

    Proteins where this domain is known:
    MAL13P1.31    PF11_0348   


    PTHR11064 - PTHR11064 (Panther link)

    Proteins where this domain is known:
    PF11_0477    PF13_0043   


    PTHR11064:SF9 - CBFA_NFYB_topo (Panther link)

    Interpro entry IPR003957 : Transcription factor, CBFA/NFYB, DNA topoisomerase (Interpro link)

    Interpro description:

    The CCAAT-binding factor (CBF) is a mammalian transcription factor that binds to a CCAAT motif in the promoters of a wide variety of genes, including type I collagen and albumin. The factor is a heteromeric complex of A and B subunits, both of which are required for DNA-binding. The subunits can interact in the absence of DNA-binding, conserved regions in each being important in mediating this interaction.

    The A subunit can be split into 3 domains on the basis of sequence similarity, a non-conserved N-terminal 'A domain'; a highly-conserved central 'B domain' involved in DNA-binding; and a C-terminal 'C domain', which contains a number of glutamine and acidic residues involved in protein-protein interactions. The A subunit shows striking similarity to the HAP3 subunit of the yeast CCAAT-binding heterotrimeric transcription factor. The Kluyveromyces lactis HAP3 protein has been predicted to contain a 4-cysteine zinc finger, which is thought to be present in similar HAP3 and CBF subunit A proteins, in which the third cysteine is replaced by a serine. This family also includes DNA topoisomerase II, which controls the topology of DNA by transient breaking of the strands and rejoining.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF11_0477   


    PTHR11067 - Ham1p_like (Panther link)

    Interpro entry IPR002637 : Ham1-like protein (Interpro link)

    Interpro description:

    This family contains the Saccharomyces cerevisiae (Baker's yeast) HAM1 protein and other hypothetical archaeal, bacterial and Caenorhabditis elegans proteins. S. cerevisiae HAM1 protects against the mutagenic effects of the base analog 6-N-hydroxylaminopurine (HAP), which can be a natural product of monooxygenase activity on adenine. HAM1 protein protects the cell from HAP, either at the level of deoxynucleoside triphosphate or at the DNA level, by an as yet unidentified set of reactions.

    Proteins where this domain is known:
    MAL7P1.110   


    PTHR11067:SF7 - PTHR11067:SF7 (Panther link)

    Proteins where this domain is known:
    MAL7P1.110   


    PTHR11070 - UvrD_helicase (Panther link)

    Interpro entry IPR000212 : DNA helicase, UvrD/REP type (Interpro link)

    Interpro description:

    Members of this family are helicases that catalyse ATP-dependent unwinding of double-stranded DNA to single-stranded DNA. The family includes both Rep and UvrD helicases. The Rep family helicases are composed of four structural domains. The Rep proteins function as dimers.

    Proteins where this domain is known:
    PFE0705c   


    PTHR11071 - PPIase_cyclophilin (Panther link)

    Proteins where this domain is known:
    PF08_0086    PF08_0121    PF11_0164    PF11_0170    PF13_0122    PF14_0223    PFC0975c    PFE0505w    PFE1430c    PFI1490c    PFL0120c    PFL0735w   


    PTHR11071:SF10 - PEPTIDYL-PROLYL CIS-TRANS ISOMERASE (CYCLOPHILIN) (Panther link)

    Proteins where this domain is known:
    PF14_0223   


    PTHR11071:SF27 - PTHR11071:SF27 (Panther link)

    Proteins where this domain is known:
    PF08_0086   


    PTHR11071:SF31 - PTHR11071:SF31 (Panther link)

    Proteins where this domain is known:
    PFL0735w   


    PTHR11071:SF38 - PTHR11071:SF38 (Panther link)

    Proteins where this domain is known:
    PF11_0170    PFI1490c   


    PTHR11071:SF40 - PTHR11071:SF40 (Panther link)

    Proteins where this domain is known:
    PFE0505w   


    PTHR11071:SF41 - PTHR11071:SF41 (Panther link)

    Proteins where this domain is known:
    PFE1430c   


    PTHR11071:SF46 - PTHR11071:SF46 (Panther link)

    Proteins where this domain is known:
    PFL0120c   


    PTHR11071:SF55 - CYCLOPHILIN-6 (Panther link)

    Proteins where this domain is known:
    PF11_0164   


    PTHR11071:SF58 - PTHR11071:SF58 (Panther link)

    Proteins where this domain is known:
    PF08_0121   


    PTHR11071:SF76 - PTHR11071:SF76 (Panther link)

    Proteins where this domain is known:
    PF13_0122   


    PTHR11075 - PTHR11075 (Panther link)

    Proteins where this domain is known:
    MAL7P1.20    PF14_0265    PFD0480w    PFI1575c   


    PTHR11075:SF5 - PTHR11075:SF5 (Panther link)

    Proteins where this domain is known:
    PFI1575c   


    PTHR11075:SF6 - PrfB (Panther link)

    Interpro entry IPR004374 : Peptide chain release factor 2 (Interpro link)

    Interpro description:
    In many but not all taxa, there is a conserved real translational frameshift at a TGA codon. RF-2 helps terminate translation at TGA codons and can therefore regulate its own production by readthrough when RF-2 is insufficient. There is a superfamily of RF-1, RF-2, mitochondrial, RF-H, etc. proteins.

    Proteins where this domain is known:
    MAL7P1.20   


    PTHR11075:SF9 - PrfA (Panther link)

    Interpro entry IPR004373 : Peptide chain release factor 1 (Interpro link)

    Interpro description:
    This family describes peptide chain release factor 1 (PrfA, RF-1), and excludes the related peptide chain release factor 2 (PrfB, RF-2). RF-1 helps recognise and terminate translation at UAA and UAG stop codons. The mitochondrial release factors are prfA-like, although not included above the trusted cut-off for this model. RF-1 does not have a translational frameshift.
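
    The stop-codon assignments given in the PrfA and PrfB entries can be written as a small lookup table. This is only an illustrative sketch: the names RELEASE_FACTORS and release_factors_for are invented here, and the pairing of RF-2 with UAA is general background knowledge that neither entry states.

```python
# Stop codon -> release factors that recognise it.
# UAA/UAG for RF-1 and UGA (TGA) for RF-2 come from the entries above;
# RF-2 reading UAA as well is background knowledge, not stated in the entries.
RELEASE_FACTORS = {
    "UAA": ("RF-1", "RF-2"),
    "UAG": ("RF-1",),
    "UGA": ("RF-2",),
}

def release_factors_for(codon: str) -> tuple[str, ...]:
    """Return the release factors for a codon; an empty tuple for sense codons.

    Accepts DNA (T) or RNA (U) spelling in either case.
    """
    rna = codon.upper().replace("T", "U")
    return RELEASE_FACTORS.get(rna, ())

for codon in ("TGA", "UAG", "UAA", "AUG"):
    print(codon, release_factors_for(codon))
```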

    Proteins where this domain is known:
    PF14_0265   


    PTHR11076 - UMUC_like (Panther link)

    Interpro entry IPR001126 : DNA-repair protein, UmuC-like (Interpro link)

    Interpro description:

    In Escherichia coli, UV and many chemicals appear to cause mutagenesis by a process of translesion synthesis that requires DNA polymerase III and the SOS-regulated proteins UmuD, UmuC and RecA. This machinery allows replication to continue through a DNA lesion, and therefore avoids lethal interruption of DNA replication after DNA damage. UmuC is a well conserved protein in prokaryotes, with a homologue in yeast species.

    Proteins currently known to belong to this family are listed below:

    Proteins where this domain is known:
    PFI0510c   


    PTHR11079 - PTHR11079 (Panther link)

    Proteins where this domain is known:
    PF13_0259    PFL0230w   


    PTHR11079:SF3 - PTHR11079:SF3 (Panther link)

    Proteins where this domain is known:
    PFL0230w   


    PTHR11079:SF9 - PTHR11079:SF9 (Panther link)

    Proteins where this domain is known:
    PF13_0259   


    PTHR11080 - PTHR11080 (Panther link)

    Proteins where this domain is known:
    PFC0910w   


    PTHR11080:SF2 - PTHR11080:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0910w   


    PTHR11081 - XPGC_Rad (Panther link)

    Interpro entry IPR006084 : DNA repair protein (XPGC)/yeast Rad (Interpro link)

    Interpro description:

    Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
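
    The conserved pentapeptide is written above in PROSITE-style notation, E-A-[DE]-A-[QS], where square brackets list the residues allowed at a position. A minimal sketch of how such a pattern could be turned into a regular expression follows. The helper names and the test sequence are invented for illustration, and the converter deliberately handles only the simple elements used here (single residues, bracketed alternatives and x for any residue), not the full PROSITE syntax.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Convert a simple PROSITE-style pattern to a regex string.

    Supports only single residues, [..] alternatives and 'x' (any residue);
    repeats, exclusions and anchors are not handled in this sketch.
    """
    parts = []
    for element in pattern.split("-"):
        parts.append("." if element.lower() == "x" else element)
    return "".join(parts)

def find_pentapeptide(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each E-A-[DE]-A-[QS] hit."""
    regex = re.compile(prosite_to_regex("E-A-[DE]-A-[QS]"))
    return [(m.start() + 1, m.group(0)) for m in regex.finditer(sequence.upper())]

# Toy sequence, invented for this example only.
print(prosite_to_regex("E-A-[DE]-A-[QS]"))
print(find_pentapeptide("GGEADAQLL"))
```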

    Proteins where this domain is known:
    PF07_0105    PF10_0080    PFB0265c    PFD0420c   


    PTHR11081:SF6 - PTHR11081:SF6 (Panther link)

    Proteins where this domain is known:
    PFB0265c   


    PTHR11081:SF8 - PTHR11081:SF8 (Panther link)

    Proteins where this domain is known:
    PF07_0105   


    PTHR11081:SF9 - FLAP ENDONUCLEASE-1 (Panther link)

    Proteins where this domain is known:
    PF10_0080    PFD0420c   


    PTHR11082 - Du_synth (Panther link)

    Interpro entry IPR001269 : tRNA-dihydrouridine synthase (Interpro link)

    Interpro description:

    Members of this family catalyse the reduction of the 5,6-double bond of a uridine residue on tRNA. Dihydrouridine modification of tRNA is widely observed in prokaryotes and eukaryotes, and also in some archaea. Most dihydrouridines are found in the D loop of tRNAs. The role of dihydrouridine in tRNA is currently unknown, but it may increase conformational flexibility of the tRNA. It is likely that different family members have different substrate specificities, which may overlap. Dus 1 from Saccharomyces cerevisiae (Baker's yeast) acts on pre-tRNA-Phe, while Dus 2 acts on pre-tRNA-Tyr and pre-tRNA-Leu. Dus 1 is active as a single subunit, requiring NADPH or NADH, and is stimulated by the presence of FAD. Some family members may be targeted to the mitochondria and even have a role in mitochondria.

    Proteins where this domain is known:
    PF14_0086    PFI0920c   


    PTHR11082:SF5 - PTHR11082:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0086   


    PTHR11082:SF9 - PTHR11082:SF9 (Panther link)

    Proteins where this domain is known:
    PFI0920c   


    PTHR11085 - SIR2 (Panther link)

    Interpro entry IPR003000 : NAD-dependent histone deacetylase, silent information regulator Sir2 (Interpro link)

    Interpro description:
    These sequences represent the Sir2 family of NAD+-dependent deacetylases. Silent Information Regulator protein of Saccharomyces cerevisiae (Sir2p) is one of several factors critical for silencing at least three loci. Among them, it is unique because it silences the rDNA as well as the mating type loci and telomeres. Sir2p interacts in a complex with itself and with Sir3p and Sir4p, two proteins that are able to interact with nucleosomes. In addition Sir2p also interacts with ubiquitination factors and/or complexes. Unlike Sir3p and Sir4p, for which no homologues are known, Sir2p is part of a multigene family in yeast, the homologues being HST1, HST2, HST3 and HST4. Highly conserved structural homologues also occur in other organisms ranging from bacteria to man and plants. Proteins of this family have been proposed to play a role in silencing, chromosome stability and ageing. In addition, an in vitro ADP ribosyltransferase activity has been associated with Escherichia coli and human members of this family. Homologues of Sir2 share a core domain including the GAG and NID motifs and a putative C4 zinc finger. The regions containing these three conserved motifs are individually essential for Sir2 silencing function, as are the four cysteines. In addition, the conserved residues HG next to the putative Zn finger have been shown to be essential for the ADP ribosyltransferase activity. Sir2-like enzymes catalyze a reaction in which the cleavage of NAD(+) and histone and/or protein deacetylation are coupled to the formation of O-acetyl-ADP-ribose, a novel metabolite. The dependence of the reaction on both NAD(+) and the generation of this potential second messenger offers new clues to understanding the function and regulation of nuclear, cytoplasmic and mitochondrial Sir2-like enzymes.

    Proteins where this domain is known:
    PF13_0152   


    PTHR11088 - IPPT (Panther link)

    Interpro entry IPR002627 : tRNA isopentenyltransferase (Interpro link)

    Interpro description:
    tRNA isopentenyltransferases are also known as tRNA delta(2)-isopentenylpyrophosphate transferases or IPP transferases. These enzymes modify both cytoplasmic and mitochondrial tRNAs at A(37) to give isopentenyl A(37).

    Proteins where this domain is known:
    PFL0380c   


    PTHR11088:SF21 - PTHR11088:SF21 (Panther link)

    Proteins where this domain is known:
    PFL0380c   


    PTHR11089 - PTHR11089 (Panther link)

    Proteins where this domain is known:
    PF14_0221    PF14_0292    PF14_0345    PF14_0564    PFD0530c    PFE1435c   


    PTHR11089:SF10 - PTHR11089:SF10 (Panther link)

    Proteins where this domain is known:
    PF14_0221   


    PTHR11089:SF2 - PTHR11089:SF2 (Panther link)

    Proteins where this domain is known:
    PFE1435c   


    PTHR11089:SF3 - PTHR11089:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0564   


    PTHR11089:SF4 - PTHR11089:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0345    PFD0530c   


    PTHR11089:SF7 - PTHR11089:SF7 (Panther link)

    Proteins where this domain is known:
    PF14_0292   


    PTHR11093 - PTHR11093 (Panther link)

    Proteins where this domain is known:
    PF08_0100    PF11_0071    PF13_0330   


    PTHR11093:SF1 - PTHR11093:SF1 (Panther link)

    Proteins where this domain is known:
    PF08_0100    PF11_0071   


    PTHR11093:SF2 - PTHR11093:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0330   


    PTHR11096 - RNA3'_term_phos_cycl (Panther link)

    Interpro entry IPR000228 : (Interpro link)

    Interpro description:
    RNA cyclases are a family of RNA-modifying enzymes that are conserved in eukaryotes, bacteria and archaea. RNA 3'-terminal phosphate cyclase catalyses the conversion of 3'-phosphate to a 2',3'-cyclic phosphodiester at the end of RNA.
     ATP + RNA 3'-terminal-phosphate = AMP + diphosphate + RNA terminal-2',3'-cyclic-phosphate 
    These enzymes might be responsible for production of the cyclic phosphate RNA ends that are known to be required by many RNA ligases in both prokaryotes and eukaryotes.

    RNA cyclase is a protein of 36 to 42 kDa. The best conserved region is a glycine-rich stretch of residues located in the central part of the sequence, which is reminiscent of various ATP, GTP or AMP glycine-rich loops.

    The crystal structure of RNA 3'-terminal phosphate cyclase shows that each molecule consists of two domains. The larger domain contains three repeats of a folding unit comprising two parallel alpha helices and a four-stranded beta sheet; this fold was previously identified in translation initiation factor 3 (IF3). The large domain is similar to one of the two domains of 5-enolpyruvylshikimate-3-phosphate synthase and UDP-N-acetylglucosamine enolpyruvyl transferase. The smaller domain uses a similar secondary structure element with different topology, observed in many other proteins such as thioredoxin. Although the active site of this enzyme could not be unambiguously assigned, it can be mapped to a region surrounding His309, an adenylate acceptor, in which a number of amino acids are highly conserved in the enzyme from different sources.

    Proteins where this domain is known:
    PF14_0677   


    PTHR11097 - PTHR11097 (Panther link)

    Proteins where this domain is known:
    MAL13P1.204    PF13_0340   


    PTHR11097:SF5 - PTHR11097:SF5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.204   


    PTHR11097:SF6 - PTHR11097:SF6 (Panther link)

    Proteins where this domain is known:
    PF13_0340   


    PTHR11098 - PTHR11098 (Panther link)

    Proteins where this domain is known:
    PFF1410c   


    PTHR11098:SF1 - NAPRTase (Panther link)

    Interpro entry IPR007229 : Nicotinate phosphoribosyltransferase-related (Interpro link)

    Interpro description:
    Nicotinate phosphoribosyltransferase is the rate-limiting enzyme that catalyses the first reaction in the NAD salvage synthesis. This family also contains a number of closely related proteins for which a catalytic activity has not been experimentally demonstrated.

    Proteins where this domain is known:
    PFF1410c   


    PTHR11099 - Vps35 (Panther link)

    Interpro entry IPR005378 : (Interpro link)

    Interpro description:

    The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs.

    To carry out its biological functions, Vps5 forms the retromer complex with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. Vps35 contains a central region of weaker sequence similarity, thought to indicate the presence of at least three domains.

    Proteins where this domain is known:
    PF11_0112   


    PTHR11101 - Phos_transporter (Panther link)

    Interpro entry IPR001204 : Phosphate transporter (Interpro link)

    Interpro description:

    The PHO-4 family of transporters includes the phosphate-repressible phosphate permease (PHO-4) from Neurospora crassa which is probably a sodium-phosphate symporter. This family also includes the human leukemia virus receptor.

    Proteins where this domain is known:
    MAL13P1.206   


    PTHR11101:SF5 - PTHR11101:SF5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.206   


    PTHR11102 - PTHR11102 (Panther link)

    Proteins where this domain is known:
    PF14_0462    PFB0190c    PFC0550w   


    PTHR11102:SF11 - PTHR11102:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0462    PFB0190c   


    PTHR11102:SF9 - PTHR11102:SF9 (Panther link)

    Proteins where this domain is known:
    PFC0550w   


    PTHR11106 - PTHR11106 (Panther link)

    Proteins where this domain is known:
    MAL13P1.74    MAL7P1.83    PF14_0466   


    PTHR11106:SF2 - PTHR11106:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0466   


    PTHR11106:SF4 - PTHR11106:SF4 (Panther link)

    Proteins where this domain is known:
    MAL7P1.83   


    PTHR11108 - Ferrochelatase (Panther link)

    Interpro entry IPR001015 : Ferrochelatase (Interpro link)

    Interpro description:
    Synonym(s): Protohaem ferro-lyase, Iron chelatase, etc.

    Ferrochelatase catalyses the last step in haem biosynthesis: the chelation of a ferrous ion to proto-porphyrin IX, to form protohaem. In eukaryotic cells, it binds to the mitochondrial inner membrane with its active site on the matrix side of the membrane.

    The X-ray structures of Bacillus subtilis and human ferrochelatase have been solved. The human enzyme exists as a homodimer. Each subunit contains one [Fe2S2] cluster. The monomer is folded into two similar domains, each with a four-stranded parallel beta-sheet flanked by an alpha-helix in a beta-alpha-beta motif that is reminiscent of the fold found in the periplasmic binding proteins. The topological similarity between the domains suggests that they have arisen from a gene duplication event. However, significant differences exist between the two domains, including an N-terminal section (residues 80-130) that forms part of the active site pocket, and a C-terminal extension (residues 390-423) that is involved in coordination of the [Fe2S2] cluster and in stabilisation of the homodimer. The [Fe2S2] cluster ligands are Cys196, Cys403, Cys406 and Cys411. Experiments with Co(II) binding show that His230 and Asp383 are part of the enzyme active site.

    Ferrochelatase seems to have a structurally conserved core region that is common to the enzyme from bacteria, plants and mammals. Porphyrin binds in the identified cleft; this cleft also includes the metal-binding site of the enzyme. It is likely that the structure of the cleft region will have different conformations upon substrate binding and release.

    Proteins where this domain is known:
    MAL13P1.326   


    PTHR11109 - GTP_cyclohydro_I (Panther link)

    Interpro entry IPR001474 : GTP cyclohydrolase I (Interpro link)

    Interpro description:

    GTP cyclohydrolase I catalyzes the biosynthesis of formic acid and dihydroneopterin triphosphate from GTP. This reaction is the first step in the biosynthesis of tetrahydrofolate in prokaryotes, of tetrahydrobiopterin in vertebrates, and of pteridine-containing pigments in insects. The comparison of the sequence of the enzyme from bacterial and eukaryotic sources shows that the structure of this enzyme has been extremely well conserved throughout evolution.

    Proteins where this domain is known:
    PFL1155w   


    PTHR11117 - CoA_lig_alpha (Panther link)

    Proteins where this domain is known:
    PF11_0097   


    PTHR11118 - UPF0027 (Panther link)

    Interpro entry IPR001233 : (Interpro link)

    Interpro description:
    A number of uncharacterised proteins including Escherichia coli rtcB, Mycobacterium tuberculosis MtCY441.01., Caenorhabditis elegans F16A11.2 and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0682 belong to this family.

    Proteins where this domain is known:
    PF11_0068    PFL1060c   


    PTHR11124 - PTHR11124 (Panther link)

    Proteins where this domain is known:
    PF14_0064   


    PTHR11125 - PTHR11125 (Panther link)

    Proteins where this domain is known:
    PFF0535c   


    PTHR11127 - PTHR11127 (Panther link)

    Proteins where this domain is known:
    PF14_0296   


    PTHR11129 - PTHR11129 (Panther link)

    Proteins where this domain is known:
    PF14_0403    PFL2050w   


    PTHR11129:SF1 - PTHR11129:SF1 (Panther link)

    Proteins where this domain is known:
    PFL2050w   


    PTHR11129:SF2 - RAB GERANYLGERANYL TRANSFERASE ALPHA SUBUNIT (Panther link)

    Proteins where this domain is known:
    PF14_0403   


    PTHR11130 - GSH_synth_ATP_bd (Panther link)

    Interpro entry IPR005615 : Glutathione synthase, eukaryotic (Interpro link)

    Interpro description:

    This entry represents eukaryotic glutathione synthetase (GSS), a homodimeric enzyme that catalyses the conversion of gamma-L-glutamyl-L-cysteine and glycine to phosphate and glutathione in the presence of ATP. This is the second step in glutathione biosynthesis, the first step being catalysed by gamma-glutamylcysteine synthetase. In humans, defects in GSS are inherited in an autosomal recessive way and are the cause of severe metabolic acidosis, 5-oxoprolinuria, and increased rate of haemolysis and defective function of the central nervous system.

    Proteins where this domain is known:
    PFE0605c   


    PTHR11132 - PTHR11132 (Panther link)

    Proteins where this domain is known:
    PFB0535w    PFE0410w    PFE1510c    PFL0890c   


    PTHR11134 - PTHR11134 (Panther link)

    Proteins where this domain is known:
    MAL7P1.164    PFE1400c   


    PTHR11134:SF3 - ADAPTER-RELATED PROTEIN COMPLEX 1, BETA SUBUNIT (Panther link)

    Proteins where this domain is known:
    PFE1400c   


    PTHR11134:SF4 - PTHR11134:SF4 (Panther link)

    Proteins where this domain is known:
    MAL7P1.164   


    PTHR11135 - PTHR11135 (Panther link)

    Proteins where this domain is known:
    PFL1345c   


    PTHR11136 - Fpolygl_synthtse (Panther link)

    Interpro entry IPR001645 : Folylpolyglutamate synthetase (Interpro link)

    Interpro description:

    Folylpolyglutamate synthase (FPGS), which is responsible for the addition of a polyglutamate tail to folate and folate derivatives, is an ATP-dependent enzyme isolated from eukaryotic and bacterial sources, where it plays a key role in the retention of the intracellular folate pool. Its sequence is moderately conserved between prokaryotes (gene folC) and eukaryotes.

    FPGS belongs to a protein family that contains a number of related peptidoglycan synthetases (Mur).

    A crystal structure of the MgATP complex of the enzyme from Lactobacillus casei reveals that folylpolyglutamate synthetase is a modular protein consisting of two domains, one with a typical mononucleotide-binding fold and the other strikingly similar to the folate-binding enzyme dihydrofolate reductase. The active site of the enzyme is located in a large interdomain cleft adjacent to an ATP-binding P-loop motif. Opposite this site, in the C domain, a cavity likely to be the folate binding site has been identified, and inspection of this cavity and the surrounding protein structure suggests that the glutamate tail of the substrate may project into the active site. A further feature of the structure is a well defined Omega loop, which contributes both to the active site and to interdomain interactions.

    Proteins where this domain is known:
    PF13_0140   


    PTHR11138 - Met_tRNA_Form_TA-like (Panther link)

    Interpro entry IPR015518 : (Interpro link)

    Interpro description:

    Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EF-Tu-GTP. This family also includes formyl tetrahydrofolate dehydrogenases, which produce formate from formyl-tetrahydrofolate. These enzymes contain an N-terminal domain in common with other formyl transferase enzymes. The C-terminal domain has an open beta-barrel fold.

    Proteins where this domain is known:
    MAL13P1.67   


    PTHR11140 - PTHR11140 (Panther link)

    Proteins where this domain is known:
    PFD0265w   


    PTHR11141 - PTHR11141 (Panther link)

    Proteins where this domain is known:
    PF08_0036   


    PTHR11142 - PseudoU_synth_1 (Panther link)

    Interpro entry IPR001406 : tRNA pseudouridine synthase (Interpro link)

    Interpro description:
    Transfer RNA-pseudouridine synthetase contains one atom of zinc essential for its native conformation and tRNA recognition and has a strictly conserved aspartic acid that is likely to be involved in catalysis. It is involved in the formation of pseudouridine at positions 38, 39 and 40 in the anticodon stem and loop of transfer-RNAs. Pseudouridine is the most abundant modified nucleoside found in all cellular RNAs.

    Proteins where this domain is known:
    PF08_0123    PFE0815w    PFI0420c   


    PTHR11143 - Ribosomal_L26e/a (Panther link)

    Interpro entry IPR005756 : Ribosomal protein L26, eukaryotic/archaeal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L24 is one of the proteins from the large ribosomal subunit. In their mature form, these proteins have 103 to 150 amino-acid residues. This entry represents the archaeal and eukaryotic branch of these proteins, known as the L26 family.

    Proteins where this domain is known:
    PFC0535w   


    PTHR11143:SF2 - PTHR11143:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0535w   


    PTHR11157 - GNS1_SUR4 (Panther link)

    Interpro entry IPR002076 : GNS1/SUR4 membrane protein (Interpro link)

    Interpro description:

    The eukaryotic integral membrane proteins in this group are evolutionarily related, but their exact function has not yet been clearly established. The proteins have from 290 to 435 amino acid residues. Structurally, they seem to be formed of three sections: an N-terminal region with two transmembrane domains, a central hydrophilic loop and a C-terminal region that contains from one to three transmembrane domains. Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signalling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1.

    Proteins where this domain is known:
    PFA0455c    PFF0290w    PFI0980w   


    PTHR11158 - PTHR11158 (Panther link)

    Proteins where this domain is known:
    PF13_0138   


    PTHR11164 - GCS (Panther link)

    Interpro entry IPR004308 : Glutamate-cysteine ligase catalytic subunit (Interpro link)

    Interpro description:
    This family represents the catalytic subunit of glutamate-cysteine ligase, also known as gamma-glutamylcysteine synthetase (GCS). This enzyme catalyses the rate limiting step in the biosynthesis of glutathione. The eukaryotic enzyme is a dimer of a heavy chain and a light chain with all the catalytic activity exhibited by the heavy chain.

    Proteins where this domain is known:
    PFI0925w   


    PTHR11165 - Skp1 (Panther link)

    Interpro entry IPR001232 : SKP1 component (Interpro link)

    Interpro description:

    SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex and is also involved in the ubiquitin pathway. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1.

    Proteins where this domain is known:
    MAL13P1.337   


    PTHR11165:SF13 - PTHR11165:SF13 (Panther link)

    Proteins where this domain is known:
    MAL13P1.337   


    PTHR11177 - CHITINASE (Panther link)

    Proteins where this domain is known:
    PFL2510w   


    PTHR11178 - PTHR11178 (Panther link)

    Proteins where this domain is known:
    PFI1050c    PFI1835c   


    PTHR11193 - PTHR11193 (Panther link)

    Proteins where this domain is known:
    MAL13P1.253   


    PTHR11200 - PTHR11200 (Panther link)

    Proteins where this domain is known:
    MAL8P1.151    PF07_0024    PF11_0122    PF13_0285   


    PTHR11200:SF10 - PTHR11200:SF10 (Panther link)

    Proteins where this domain is known:
    PF07_0024   


    PTHR11200:SF11 - PTHR11200:SF11 (Panther link)

    Proteins where this domain is known:
    PF13_0285   


    PTHR11200:SF13 - PTHR11200:SF13 (Panther link)

    Proteins where this domain is known:
    PF11_0122   


    PTHR11200:SF9 - PTHR11200:SF9 (Panther link)

    Proteins where this domain is known:
    MAL8P1.151   


    PTHR11201 - PTHR11201 (Panther link)

    Proteins where this domain is known:
    PFC1045c    PFF1120c    PFI0900w   


    PTHR11201:SF198 - gb def: ENSANGP00000025551 (Fragment) (Panther link)

    Proteins where this domain is known:
    PFI0900w   


    PTHR11201:SF245 - PTHR11201:SF245 (Panther link)

    Proteins where this domain is known:
    PFF1120c   


    PTHR11203 - PTHR11203 (Panther link)

    Proteins where this domain is known:
    MAL8P1.64    PF10_0089    PF14_0364    PFC0825c   


    PTHR11203:SF1 - PTHR11203:SF1 (Panther link)

    Proteins where this domain is known:
    PF10_0089   


    PTHR11203:SF11 - PTHR11203:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0364   


    PTHR11203:SF5 - PTHR11203:SF5 (Panther link)

    Proteins where this domain is known:
    MAL8P1.64   


    PTHR11203:SF8 - CLEAVAGE AND POLYADENYLATION SPECIFICITY FACTOR-RELATED (Panther link)

    Proteins where this domain is known:
    PFC0825c   


    PTHR11205 - Ribosomal_S7 (Panther link)

    Interpro entry IPR000235 : Ribosomal protein S7 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3' end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure for S7 is known.

    Proteins where this domain is known:
    PF07_0088   


    PTHR11205:SF1 - Ribosomal_S7e/a (Panther link)

    Interpro entry IPR005716 : Ribosomal protein S7, eukaryotic/archaeal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family describes the members from the eukaryotic cytosol and the Archaea of the family that includes ribosomal protein S7 of bacteria and S5 of eukaryotes.

    Proteins where this domain is known:
    PF07_0088   


    PTHR11206 - MatE (Panther link)

    Interpro entry IPR002528 : Multi antimicrobial extrusion protein MatE (Interpro link)

    Interpro description:

    Characterised members of the Multi Antimicrobial Extrusion (MATE) family function as drug/sodium antiporters. These proteins mediate resistance to a wide range of cationic dyes, fluoroquinolones, aminoglycosides and other structurally diverse antibiotics and drugs. MATE proteins are found in bacteria, archaea and eukaryotes. These proteins are predicted to have 12 alpha-helical transmembrane regions; some of the animal proteins may have an additional C-terminal helix.

    Proteins where this domain is known:
    PFB0580w   


    PTHR11208 - PTHR11208 (Panther link)

    Proteins where this domain is known:
    PFF1135w   


    PTHR11208:SF6 - PTHR11208:SF6 (Panther link)

    Proteins where this domain is known:
    PFF1135w   


    PTHR11210 - PTHR11210 (Panther link)

    Proteins where this domain is known:
    PFC0845c    PFF1180w   


    PTHR11210:SF2 - PTHR11210:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0845c   


    PTHR11215 - Met-dep_prot_hydro (Panther link)

    Interpro entry IPR003226 : (Interpro link)

    Interpro description:
    The function of this domain is not known, but it is found in several uncharacterised proteins and a probable metal dependent protein hydrolase.

    Proteins where this domain is known:
    PFF1295w   


    PTHR11216 - PTHR11216 (Panther link)

    Proteins where this domain is known:
    PF10_0244    PFC0190c   


    PTHR11216:SF11 - PTHR11216:SF11 (Panther link)

    Proteins where this domain is known:
    PF10_0244   


    PTHR11216:SF31 - PTHR11216:SF31 (Panther link)

    Proteins where this domain is known:
    PFC0190c   


    PTHR11223 - PTHR11223 (Panther link)

    Proteins where this domain is known:
    PFC0135c   


    PTHR11223:SF1 - PTHR11223:SF1 (Panther link)

    Proteins where this domain is known:
    PFC0135c   


    PTHR11227 - PTHR11227 (Panther link)

    Proteins where this domain is known:
    PF10_0126   


    PTHR11227:SF18 - PTHR11227:SF18 (Panther link)

    Proteins where this domain is known:
    PF10_0126   


    PTHR11229 - PTHR11229 (Panther link)

    Proteins where this domain is known:
    PFI0890c    PFL2180w   


    PTHR11236 - TRPE_1_chor_bd (Panther link)

    Interpro entry IPR005801 : Anthranilate synthase component I and chorismate binding protein (Interpro link)

    Interpro description:
    This entry represents the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase. Anthranilate synthase catalyses the reaction:
     chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate.
    The enzyme is a tetramer comprising two copies each of components I and II. This entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.
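
    The anthranilate synthase reaction quoted above can be checked for atom balance. The sketch below is illustrative only: the molecular formulas are those of the neutral (fully protonated) compounds and are supplied here as an assumption, since the entry itself gives no formulas, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Molecular formulas of the neutral acid forms (assumed; not given in the entry).
FORMULAS = {
    "chorismate": "C10H10O6",
    "L-glutamine": "C5H10N2O3",
    "anthranilate": "C7H7NO2",
    "pyruvate": "C3H4O3",
    "L-glutamate": "C5H9NO4",
}


def atom_counts(formula):
    """Count atoms in a simple formula such as 'C10H10O6' (no brackets)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_total(species):
    """Sum the atom counts over one side of the reaction."""
    total = Counter()
    for name in species:
        total += atom_counts(FORMULAS[name])
    return total


substrates = ["chorismate", "L-glutamine"]
products = ["anthranilate", "pyruvate", "L-glutamate"]

# True when every element occurs equally often on both sides.
balanced = side_total(substrates) == side_total(products)
print("balanced:", balanced)
```

    With these formulas both sides carry the same numbers of C, H, N and O atoms, so the equation as written is stoichiometrically complete.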

    Proteins where this domain is known:
    PFI1100w   


    PTHR11236:SF7 - PTHR11236:SF7 (Panther link)

    Proteins where this domain is known:
    PFI1100w   


    PTHR11239 - PTHR11239 (Panther link)

    Proteins where this domain is known:
    PFA0505c    PFB0290c    PFD0360w   


    PTHR11239:SF1 - PTHR11239:SF1 (Panther link)

    Proteins where this domain is known:
    PFA0505c   


    PTHR11239:SF2 - PTHR11239:SF2 (Panther link)

    Proteins where this domain is known:
    PFB0290c   


    PTHR11239:SF3 - PTHR11239:SF3 (Panther link)

    Proteins where this domain is known:
    PFD0360w   


    PTHR11241 - PTHR11241 (Panther link)

    Proteins where this domain is known:
    PF11_0282   


    PTHR11246 - PTHR11246 (Panther link)

    Proteins where this domain is known:
    PF11_0108    PFD0180c    PFL1735c   


    PTHR11246:SF1 - PTHR11246:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0108   


    PTHR11246:SF2 - PTHR11246:SF2 (Panther link)

    Proteins where this domain is known:
    PFL1735c   


    PTHR11246:SF3 - PTHR11246:SF3 (Panther link)

    Proteins where this domain is known:
    PFD0180c   


    PTHR11247 - PALMITOYL-PROTEIN THIOESTERASE/DOLICHYLDIPHOSPHATASE 1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.202   


    PTHR11247:SF1 - DOLICHYLDIPHOSPHATASE 1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.202   


    PTHR11254 - PTHR11254 (Panther link)

    Proteins where this domain is known:
    MAL7P1.19    MAL8P1.23    PF11_0201    PFF1365c    PFL0975w   


    PTHR11254:SF20 - PTHR11254:SF20 (Panther link)

    Proteins where this domain is known:
    MAL7P1.19   


    PTHR11254:SF67 - PTHR11254:SF67 (Panther link)

    Proteins where this domain is known:
    MAL8P1.23   


    PTHR11254:SF70 - PTHR11254:SF70 (Panther link)

    Proteins where this domain is known:
    PF11_0201   


    PTHR11254:SF9 - PTHR11254:SF9 (Panther link)

    Proteins where this domain is known:
    PFL0975w   


    PTHR11255 - PTHR11255 (Panther link)

    Proteins where this domain is known:
    PF14_0681    PFI1485c   


    PTHR11255:SF1 - PTHR11255:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0681   


    PTHR11255:SF4 - PTHR11255:SF4 (Panther link)

    Proteins where this domain is known:
    PFI1485c   


    PTHR11260 - PTHR11260 (Panther link)

    Proteins where this domain is known:
    PF13_0214   


    PTHR11260:SF7 - PTHR11260:SF7 (Panther link)

    Proteins where this domain is known:
    PF13_0214   


    PTHR11262 - PTHR11262 (Panther link)

    Proteins where this domain is known:
    PFI0355c   


    PTHR11262:SF3 - HslU (Panther link)

    Interpro entry IPR004491 : Heat shock protein HslU (Interpro link)

    Interpro description:

    This family of proteins represents HslU, a bacterial clpX homolog, which is an ATPase and chaperone belonging to the AAA Clp/Hsp100 family and a component of the eubacterial proteasome.

    ATP-dependent protease complexes are present in all three kingdoms of life, where they rid the cell of misfolded or damaged proteins and control the level of certain regulatory proteins. They include the proteasome in Eukaryotes, Archaea, and Actinomycetales and the HslVU (ClpQY, ClpXP) complex in other eubacteria. Genes homologous to eubacterial HslV (ClpQ) and HslU (ClpY, ClpX) have also been demonstrated to be present in the genome of trypanosomatid protozoa. They are expressed as precursors, with a propeptide that is removed to produce the active protease. The protease is probably located in the kinetoplast (mitochondrion). Phylogenetic analysis shows that HslV and HslU from trypanosomatids form a single clade with other eubacterial homologs.

    Proteins where this domain is known:
    PFI0355c   


    PTHR11264 - UDNA_glycsylse (Panther link)

    Interpro entry IPR002043 : Uracil-DNA glycosylase (Interpro link)

    Interpro description:

    Uracil-DNA glycosylase (UNG) is a DNA repair enzyme that excises uracil residues from DNA by cleaving the N-glycosylic bond. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or deamination of cytosine. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localisation, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures, to act as a general base in the catalytic mechanism.

    Proteins where this domain is known:
    PF14_0148   


    PTHR11265 - Bact_methyltrans (Panther link)

    Interpro entry IPR002903 : Bacterial methyltransferase (Interpro link)

    Interpro description:

    This is a family of methyltransferases, so called because they are responsible for the transfer of methyl groups between molecules. Despite its name, it does not occur solely in bacteria. This protein is essential in Escherichia coli and has been linked to peptidoglycan biosynthesis.

    Proteins where this domain is known:
    PFL1775c   


    PTHR11266 - Mpv17_PMP22 (Panther link)

    Interpro entry IPR007248 : Mpv17/PMP22 (Interpro link)

    Interpro description:

    The 22 kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore-forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein involved in the development of early-onset glomerulosclerosis.

    A member of this family found in Saccharomyces cerevisiae (Baker's yeast) is an integral membrane protein of the inner mitochondrial membrane and has been suggested to play a role in mitochondrial function during heat shock.

    Proteins where this domain is known:
    PFE0110w   


    PTHR11274 - PTHR11274 (Panther link)

    Proteins where this domain is known:
    PF10_0369   


    PTHR11278 - Ribosomal_S7E (Panther link)

    Interpro entry IPR000554 : Ribosomal protein S7e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of Xenopus S8, and mammalian, insect and yeast S7. These proteins have about 200 amino acids.

    Proteins where this domain is known:
    PF13_0014   


    PTHR11295 - PTHR11295 (Panther link)

    Proteins where this domain is known:
    MAL13P1.185    MAL13P1.196    MAL13P1.279    MAL13P1.84    MAL7P1.144    MAL7P1.175    PF08_0044    PF10_0141    PF10_0160    PF11_0096    PF11_0147    PF11_0156    PF14_0294    PF14_0408    PF14_0431    PFA0130c    PFC0060c    PFC0105w    PFC0525c    PFC0755c    PFD0740w    PFD0865c    PFD1165w    PFF0750w    PFI0100c   


    PTHR11295:SF110 - CDK5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.279   


    PTHR11295:SF114 - GLYCOGEN SYNTHASE KINASE 3-RELATED (GSK3) (CMGC GROUP III) (Panther link)

    Proteins where this domain is known:
    MAL13P1.84    PF08_0044    PFC0525c   


    PTHR11295:SF118 - PTHR11295:SF118 (Panther link)

    Proteins where this domain is known:
    PFC0105w   


    PTHR11295:SF17 - PTHR11295:SF17 (Panther link)

    Proteins where this domain is known:
    PF11_0156   


    PTHR11295:SF21 - PTHR11295:SF21 (Panther link)

    Proteins where this domain is known:
    PF14_0408   


    PTHR11295:SF29 - PTHR11295:SF29 (Panther link)

    Proteins where this domain is known:
    PF14_0431   


    PTHR11295:SF37 - PTHR11295:SF37 (Panther link)

    Proteins where this domain is known:
    PF11_0096   


    PTHR11295:SF39 - PTHR11295:SF39 (Panther link)

    Proteins where this domain is known:
    PFF0750w   


    PTHR11295:SF59 - MITOGEN-ACTIVATED PROTEIN KINASE 2 (Panther link)

    Proteins where this domain is known:
    PF11_0147   


    PTHR11295:SF66 - PTHR11295:SF66 (Panther link)

    Proteins where this domain is known:
    PF14_0294   


    PTHR11295:SF72 - PTHR11295:SF72 (Panther link)

    Proteins where this domain is known:
    PFD0740w   


    PTHR11295:SF74 - CDK-RELATED PROTEIN KINASE-RELATED (Panther link)

    Proteins where this domain is known:
    MAL13P1.185    PF10_0141   


    PTHR11295:SF87 - PTHR11295:SF87 (Panther link)

    Proteins where this domain is known:
    PFD0865c   


    PTHR11311 - PTHR11311 (Panther link)

    Proteins where this domain is known:
    MAL8P1.45   


    PTHR11311:SF1 - PTHR11311:SF1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.45   


    PTHR11347 - PTHR11347 (Panther link)

    Proteins where this domain is known:
    MAL13P1.118    MAL13P1.119    PF14_0672    PFL0475w   


    PTHR11347:SF14 - PTHR11347:SF14 (Panther link)

    Proteins where this domain is known:
    MAL13P1.119   


    PTHR11347:SF15 - PTHR11347:SF15 (Panther link)

    Proteins where this domain is known:
    MAL13P1.118    PFL0475w   


    PTHR11347:SF16 - PTHR11347:SF16 (Panther link)

    Proteins where this domain is known:
    PF14_0672   


    PTHR11349 - Nuc_diP_kinase_core (Panther link)

    Interpro entry IPR001564 : Nucleoside diphosphate kinase, core (Interpro link)

    Interpro description:

    Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.

    In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
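
    The random association of A and B chains into hexamers can be made concrete with a short calculation. This is a minimal sketch, assuming each of the six positions is filled independently and that chain A makes up a fraction `frac_a` of the available chains; the function names and the binomial model are illustrative assumptions, not part of the Interpro entry.

```python
from math import comb


def composition_label(n_a, n_b):
    """Build a label such as 'A5B' or 'A3B3' from the chain counts."""
    parts = []
    for chain, n in (("A", n_a), ("B", n_b)):
        if n == 1:
            parts.append(chain)
        elif n > 1:
            parts.append(f"{chain}{n}")
    return "".join(parts)


def hexamer_distribution(frac_a, size=6):
    """Binomial probability of each A/B composition of a hexamer."""
    if not 0.0 <= frac_a <= 1.0:
        raise ValueError("frac_a must lie between 0 and 1")
    dist = {}
    for n_a in range(size, -1, -1):
        n_b = size - n_a
        prob = comb(size, n_a) * frac_a**n_a * (1.0 - frac_a) ** n_b
        dist[composition_label(n_a, n_b)] = prob
    return dist


# Equal abundance of both chains (an assumed value for illustration).
for label, prob in hexamer_distribution(0.5).items():
    print(f"{label:5s} {prob:.4f}")
```

    Under this model there are seven possible compositions (A6, A5B, A4B2, A3B3, A2B4, AB5, B6), and the mixed forms dominate whenever the two chains are present in comparable amounts.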

    NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.

    NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.
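
    The two half-reactions of the ping-pong mechanism can be sketched as a small state model. This is a toy illustration under stated assumptions: the class name `NDKModel` and its methods are hypothetical, nucleotides are represented only by their abbreviations, and no kinetics or magnesium dependence is modelled.

```python
class NDKModel:
    """Toy two-state model of the ping-pong mechanism (hypothetical names)."""

    def __init__(self):
        # True once the conserved histidine carries the phosphate group.
        self.phosphorylated = False

    def accept_phosphate(self, donor):
        """Half-reaction 1: take the terminal phosphate from an NTP donor."""
        if self.phosphorylated:
            raise RuntimeError("enzyme is already phosphorylated")
        if not donor.endswith("TP"):
            raise ValueError("donor must be a nucleoside triphosphate")
        self.phosphorylated = True
        return donor[:-2] + "DP"  # e.g. ATP -> ADP

    def donate_phosphate(self, acceptor):
        """Half-reaction 2: pass the phosphate on to an NDP acceptor."""
        if not self.phosphorylated:
            raise RuntimeError("enzyme carries no phosphate to donate")
        if not acceptor.endswith("DP"):
            raise ValueError("acceptor must be a nucleoside diphosphate")
        self.phosphorylated = False
        return acceptor[:-2] + "TP"  # e.g. GDP -> GTP


enzyme = NDKModel()
released = enzyme.accept_phosphate("ATP")   # the first product leaves
produced = enzyme.donate_phosphate("GDP")   # then the second substrate reacts
print(released, produced)
```

    The model enforces the defining feature of a ping-pong mechanism: the first product is released before the second substrate reacts, and the enzyme returns to its unphosphorylated state after each full cycle.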

    Proteins where this domain is known:
    PF13_0349    PFF0275c   


    PTHR11351 - PTHR11351 (Panther link)

    Proteins where this domain is known:
    PFE0555w   


    PTHR11351:SF14 - PTHR11351:SF14 (Panther link)

    Proteins where this domain is known:
    PFE0555w   


    PTHR11352 - Pr_cel_nuc_antig (Panther link)

    Interpro entry IPR000730 : Proliferating cell nuclear antigen, PCNA (Interpro link)

    Interpro description:

    Proliferating cell nuclear antigen (PCNA), or cyclin, is a non-histone acidic nuclear protein that plays a key role in the control of eukaryotic DNA replication. It acts as a co-factor for DNA polymerase delta, which is responsible for leading strand DNA replication. The sequence of PCNA is well conserved between plants and animals, indicating a strong selective pressure for structure conservation, and suggesting that this type of DNA replication mechanism is conserved throughout eukaryotes. In Saccharomyces cerevisiae (Baker's yeast), POL30 is associated with polymerase III, the yeast analog of polymerase delta.

    Homologues of PCNA have also been identified in the archaea (Euryarchaeota and Crenarchaeota), in Paramecium bursaria Chlorella virus 1 (PBCV-1), and in nuclear polyhedrosis viruses.

    Proteins where this domain is known:
    PF13_0328    PFL1285c   


    PTHR11353 - Cpn60/TCP-1 (Panther link)

    Interpro entry IPR002423 : Chaperonin Cpn60/TCP-1 (Interpro link)

    Interpro description:

    Partially folded polypeptide chains, either newly made by ribosomes or emerging from mature proteins unfolded by stress, run the risk of aggregating with one another to the detriment of the organism. Folding of newly synthesised polypeptides in the crowded cellular environment requires the assistance of molecular chaperone proteins, such as the large bacterial chaperonins GroEL and GroES.

    GroEL and GroES prevent aggregation by encapsulating individual chains within the so-called 'Anfinsen cage' provided by the GroEL-GroES complex, where they can fold in isolation from one another. GroEL consists of two heptameric rings of identical ATPase subunits stacked back to back, containing a cage in each ring. Each subunit consists of three domains. The equatorial domain contains the nucleotide binding site and is connected by a flexible intermediate domain with the apical domain. The latter presents several hydrophobic amino-acid side chains at the top of the ring, orientated towards the cavity of the cage. These side chains are involved in binding either a partially folded polypeptide chain or a single molecule of GroES.

    The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins, which include 10 kDa and 60 kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions.

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The cpn10 and cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

    The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaires' disease, and is thought to play a role in the protection of the Legionella bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å. The TCP-1 family of proteins act as molecular chaperones for tubulin, actin and probably some other proteins. They are weakly, but significantly, related to the cpn60/groEL chaperonin family.

    Proteins where this domain is known:
    MAL13P1.283    PF10_0153    PF11_0331    PFB0635w    PFC0285c    PFC0350c    PFC0900w    PFF0430w    PFL1425w    PFL1545c   


    PTHR11353:SF19 - Chap_CCT_theta (Panther link)

    Interpro entry IPR012721 : T-complex protein 1, theta subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT theta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PFB0635w   


    PTHR11353:SF20 - Chap_CCT_alpha (Panther link)

    Interpro entry IPR012715 : T-complex protein 1, alpha subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT alpha chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PF11_0331   


    PTHR11353:SF21 - Chap_CCT_zeta (Panther link)

    Interpro entry IPR012722 : T-complex protein 1, zeta subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT zeta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PFF0430w   


    PTHR11353:SF22 - Chap_CCT_eta (Panther link)

    Interpro entry IPR012720 : T-complex protein 1, eta subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT eta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PFC0350c   


    PTHR11353:SF23 - Chap_CCT_beta (Panther link)

    Interpro entry IPR012716 : T-complex protein 1, beta subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT beta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PFC0285c   


    PTHR11353:SF24 - Chap_CCT_gamma (Panther link)

    Interpro entry IPR012719 : T-complex protein 1, gamma subunit (Interpro link)

    Interpro description:

    The TCP-1 protein (Tailless Complex Polypeptide 1) was first identified in mice where it is especially abundant in testis but present in all cell types. It has since been found and characterised in many other animal species, as well as in yeast, plants and protists. TCP-1 is a highly conserved protein of about 60 kDa (556 to 560 residues) which participates in a hetero-oligomeric 900 kDa double-torus shaped particle with 6 to 8 other different subunits. These subunits, the chaperonin containing TCP-1 (CCT) subunit beta, gamma, delta, epsilon, zeta and eta, are evolutionarily related to TCP-1 itself. The CCT is known to act as a molecular chaperone for tubulin, actin and probably some other proteins.

    The TCP-1 family of proteins are weakly, but significantly, related to the cpn60/groEL chaperonin family.

    Proteins in this entry consist exclusively of the CCT gamma chain from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PFL1425w   


    PTHR11353:SF25 - Chap_CCT_epsi (Panther link)

    Interpro entry IPR012718 : T-complex protein 1, epsilon subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT epsilon chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    PFC0900w   


    PTHR11353:SF26 - Chap_CCT_delta (Panther link)

    Interpro entry IPR012717 : T-complex protein 1, delta subunit (Interpro link)

    Interpro description:

    Members of this eukaryotic family are part of the group II chaperonin complex called CCT (chaperonin containing TCP-1) or TRiC. The archaeal equivalent group II chaperonin is often called the thermosome. Both are somewhat related to the group I chaperonin of bacteria, GroEL/GroES. This family consists exclusively of the CCT delta chain (part of a paralogous family) from animals, plants, fungi, and other eukaryotes.

    Proteins where this domain is known:
    MAL13P1.283   


    PTHR11353:SF5 - PTHR11353:SF5 (Panther link)

    Proteins where this domain is known:
    PFL1545c   


    PTHR11353:SF9 - PTHR11353:SF9 (Panther link)

    Proteins where this domain is known:
    PF10_0153   


    PTHR11358 - Ureohydrolase (Panther link)

    Interpro entry IPR006035 : Ureohydrolase (Interpro link)

    Interpro description:

    The ureohydrolase superfamily includes arginase, agmatinase, formiminoglutamase and proclavaminate amidinohydrolase. These enzymes share a 3-layer alpha-beta-alpha structure, and play important roles in arginine/agmatine metabolism, the urea cycle, histidine degradation, and other pathways.

    Arginase, which catalyses the conversion of arginine to urea and ornithine, is one of the five members of the urea cycle enzymes that convert ammonia to urea as the principal product of nitrogen excretion. There are several arginase isozymes that differ in catalytic, molecular and immunological properties. Deficiency in the liver isozyme leads to argininemia, which is usually associated with hyperammonemia.

    Agmatinase hydrolyses agmatine to putrescine, the precursor for the biosynthesis of higher polyamines, spermidine and spermine. In addition, agmatine may play an important regulatory role in mammals.

    Formiminoglutamase catalyses the fourth step in histidine degradation, acting to hydrolyse N-formimidoyl-L-glutamate to L-glutamate and formamide.

    Proclavaminate amidinohydrolase is involved in clavulanic acid biosynthesis. Clavulanic acid acts as an inhibitor of a wide range of beta-lactamase enzymes that are used by various microorganisms to resist beta-lactam antibiotics. As a result, this enzyme improves the effectiveness of beta-lactam antibiotics.

    Proteins where this domain is known:
    PFI0320w   


    PTHR11358:SF2 - Arginase_sub (Panther link)

    Interpro entry IPR014033 : Arginase, subgroup (Interpro link)

    Interpro description:

    L-Arginine is converted to nitric oxide and citrulline by the enzyme nitric oxide synthase, and to urea and ornithine by the enzyme arginase as a part of the hepatic urea cycle. Arginase is a manganese metalloenzyme containing a metal-activated hydroxide ion, a critical nucleophile in metalloenzymes that catalyze hydrolysis or hydration reactions. A hydrogen bond formed by the metal-bound hydroxide holds the enzyme in the proper orientation for catalysis; however, non-metal substrate-binding sites are also implicated in the enzyme mechanism. Regeneration of metal-bound hydroxide ion from a metal-bound water molecule requires proton transfer to bulk solvent mediated by a histidine proton shuttle residue.

    Proteins where this domain is known:
    PFI0320w   


    PTHR11359 - PTHR11359 (Panther link)

    Proteins where this domain is known:
    MAL13P1.146   


    PTHR11360 - PTHR11360 (Panther link)

    Proteins where this domain is known:
    PFB0465c    PFI1295c   


    PTHR11360:SF3 - PTHR11360:SF3 (Panther link)

    Proteins where this domain is known:
    PFB0465c    PFI1295c   


    PTHR11361 - MutS_C (Panther link)

    Interpro entry IPR000432 : DNA mismatch repair protein MutS, C-terminal (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex, multi-domain structure.

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the C-terminal region found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS comprises the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. Yeast MSH3, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein.

    Proteins where this domain is known:
    MAL7P1.206    PF14_0254    PFE0270c   


    PTHR11361:SF29 - PTHR11361:SF29 (Panther link)

    Proteins where this domain is known:
    MAL7P1.206    PF14_0254   


    PTHR11361:SF31 - MutS_Hmlg_MSH6 (Panther link)

    Interpro entry IPR015536 : (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex, multi-domain structure.

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    MSH6 is an ATPase that is part of the MSH2-MSH6 complex and has been shown in Homo sapiens (Human) to bind to mismatched DNA directly in the ADP-bound state. The MSH6 family has members from yeasts, plants, fish, and mammals. After DNA replication, a low level of replication errors exists, including base-pair mismatches. In bacteria, it was shown that MutS acts with MutL, MutH, and UvrD to correct these errors. In humans, it was shown that MSH2 and MSH6 are involved in the BASC complex (BRCA1-associated genome surveillance complex) with many other proteins including MLH1, RAD50, MRE11, NBS1, RFC1, RFC2, RFC4, BRCA1, ATM, and BLM. In humans, mutations in MSH6 have been shown to be associated with hereditary nonpolyposis colon cancer and endometrial cancer.

    Proteins where this domain is known:
    PFE0270c   


    PTHR11362 - PHOSPHATIDYLETHANOLAMINE-BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    PFL0955c   


    PTHR11363 - PTHR11363 (Panther link)

    Proteins where this domain is known:
    PF10_0272   


    PTHR11364 - PTHR11364 (Panther link)

    Proteins where this domain is known:
    PFL0320w   


    PTHR11373 - PTHR11373 (Panther link)

    Proteins where this domain is known:
    PFL1890c   


    PTHR11373:SF7 - PTHR11373:SF7 (Panther link)

    Proteins where this domain is known:
    PFL1890c   


    PTHR11375 - PTHR11375 (Panther link)

    Proteins where this domain is known:
    PF14_0257   


    PTHR11377 - Myristoyl_trans (Panther link)

    Interpro entry IPR000903 : Myristoyl-CoA:protein N-myristoyltransferase (Interpro link)

    Interpro description:

    Myristoyl-CoA:protein N-myristoyltransferase (Nmt) is the enzyme responsible for transferring a myristate group to the N-terminal glycine of a number of cellular eukaryotic and viral proteins. Nmt is a monomeric protein of about 50 to 60 kDa whose sequence appears to be well conserved.

    Proteins where this domain is known:
    PF14_0127   


    PTHR11387 - PTHR11387 (Panther link)

    Proteins where this domain is known:
    PFI1375w   


    PTHR11390 - Topo_IA (Panther link)

    Interpro entry IPR000380 : DNA topoisomerase, type IA, core (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry describes the core region of type IA topoisomerases, which are highly conserved enzymes that are structurally distinct from type IB enzymes. The structures of both topoisomerases I and III have been elucidated, and consist of four domains that together form a toroidal molecule with a central hole that is large enough to accommodate single- and double-stranded DNA. It is believed that the domains transiently separate from one another to allow the entrance and exit of DNA strands.

    Proteins where this domain is known:
    PF13_0251   


    PTHR11404 - SODismutase (Panther link)

    Interpro entry IPR001189 : Manganese and iron superoxide dismutase (Interpro link)

    Interpro description:

    Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.

    The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).

    Proteins where this domain is known:
    PF08_0071    PFF1130c   


    PTHR11404:SF8 - SUPEROXIDE DISMUTASE [FE] (Panther link)

    Proteins where this domain is known:
    PF08_0071    PFF1130c   


    PTHR11405 - CARBAMOYLTRANSFERASE RELATED (Panther link)

    Proteins where this domain is known:
    MAL13P1.221    PF13_0044   


    PTHR11405:SF3 - PTHR11405:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0044   


    PTHR11406 - PGK (Panther link)

    Interpro entry IPR001576 : Phosphoglycerate kinase (Interpro link)

    Interpro description:

    Phosphoglycerate kinase (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.

    PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal domain, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.

    Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.

    This entry represents the full PGK enzyme.

    Proteins where this domain is known:
    PFI1105w   


    PTHR11409 - A/AMP_deaminase (Panther link)

    Interpro entry IPR001365 : Adenosine/AMP deaminase (Interpro link)

    Interpro description:

    Adenosine deaminase catalyzes the hydrolytic deamination of adenosine into inosine, and AMP deaminase catalyzes the hydrolytic deamination of AMP into IMP. It has been shown that these two enzymes share three regions of sequence similarity; these regions are centred on residues which are proposed to play an important role in the catalytic mechanism of these two enzymes.

    Proteins where this domain is known:
    PF10_0289   


    PTHR11409:SF21 - ADENOSINE DEAMINASE (Panther link)

    Proteins where this domain is known:
    PF10_0289   


    PTHR11426 - Histone_H3 (Panther link)

    Interpro entry IPR000164 : Histone H3 (Interpro link)

    Interpro description:

    Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the eukaryotic nucleosome octamer core; the nucleosome octamer winds ~146 DNA base-pairs. It is a highly conserved protein of 135 amino acid residues.

    Several proteins have been found to contain a C-terminal H3-like domain, including the mammalian centromeric protein CENP-A (which may act as a core histone necessary for the assembly of centromeres); yeast chromatin-associated protein CSE4; and Caenorhabditis elegans chromosome III proteins YL82_CAEEL and YMH3_CAEEL, whose function is unknown.

    Proteins where this domain is known:
    PF13_0185    PFF0510w    PFF0865w   


    PTHR11426:SF6 - PTHR11426:SF6 (Panther link)

    Proteins where this domain is known:
    PF13_0185   


    PTHR11440 - LACT (Panther link)

    Interpro entry IPR003386 : Lecithin:cholesterol acyltransferase (Interpro link)

    Interpro description:

    Lecithin:cholesterol acyltransferase (LACT), also known as phosphatidylcholine-sterol acyltransferase, is involved in extracellular metabolism of plasma lipoproteins, including cholesterol. It esterifies the free cholesterol transported in plasma lipoproteins, and is activated by apolipoprotein A-I. Defects in LACT cause Norum and Fish eye diseases.

    Proteins where this domain is known:
    PFF1420w   


    PTHR11440:SF5 - PTHR11440:SF5 (Panther link)

    Proteins where this domain is known:
    PFF1420w   


    PTHR11444 - ASPARTATEAMMONIA/ARGININOSUCCINATE/ADENYLOSUCCINATE LYASE (Panther link)

    Proteins where this domain is known:
    PFB0295w   


    PTHR11444:SF2 - ADENYLOSUCCINATE LYASE (Panther link)

    Proteins where this domain is known:
    PFB0295w   


    PTHR11449 - Ribosomal_L30e (Panther link)

    Interpro entry IPR000231 : Ribosomal protein L30e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic, bacterial and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:

  • Mammalian L30
  • Leishmania major L30
  • Yeast YL32
  • Bacillus subtilis hypothetical protein ybxF
  • Thermococcus celer L30
  • A probable ribosomal protein (ORF 1) from Methanococcus vannielii
  • A probable ribosomal protein (ORF 104) from Sulfolobus acidocaldarius

    These proteins, of the L30e family, have 82 to 114 amino-acid residues.

    Proteins where this domain is known:
    PF10_0187   


    PTHR11449:SF1 - PTHR11449:SF1 (Panther link)

    Proteins where this domain is known:
    PF10_0187   


    PTHR11451 - PTHR11451 (Panther link)

    Proteins where this domain is known:
    PF11_0270    PFI1240c    PFL0670c   


    PTHR11451:SF5 - PTHR11451:SF5 (Panther link)

    Proteins where this domain is known:
    PF11_0270   


    PTHR11451:SF6 - ProS_fam_I (Panther link)

    Interpro entry IPR004499 : Prolyl-tRNA synthetase, class IIa, prokaryotic-type (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.
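
    As a minimal sketch, the class assignments listed above can be written as a lookup table. The Python names below (`CLASS_I`, `CLASS_II`, `synthetase_class`) are illustrative assumptions and are not part of InterPro; the amino-acid lists are copied from the description and do not cover any organism-specific exceptions.

```python
# Amino-acid specificities per synthetase class, as listed in the description.
# The names CLASS_I, CLASS_II and synthetase_class are hypothetical.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase that charges this amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")
```

    For example, `synthetase_class("Proline")` returns `"II"`, in line with the class II assignment of prolyl-tRNA synthetase.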

    Prolyl-tRNA synthetase is a class II tRNA synthetase and is recognised by a model that also recognises tRNA synthetases for Gly, His, Ser, and Pro. The prolyl-tRNA synthetases are divided into two widely divergent families. This family includes the archaeal enzyme, the Pro-specific domain of a human multifunctional tRNA ligase, and the enzyme from the spirochete Borrelia burgdorferi (Lyme disease spirochete). The other family includes enzymes from Escherichia coli, Bacillus subtilis, Synechocystis sp. (strain PCC 6308), and one of the two prolyl-tRNA synthetases of Saccharomyces cerevisiae (Baker's yeast).

    Proteins where this domain is known:
    PFL0670c   


    PTHR11451:SF7 - PTHR11451:SF7 (Panther link)

    Proteins where this domain is known:
    PFI1240c   


    PTHR11458 - AlaD_dehydratase (Panther link)

    Interpro entry IPR001731 : Tetrapyrrole biosynthesis, porphobilinogen synthase (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.
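
    The second stage and the branch point described above can be sketched as a small list of (substrate, enzyme, product) steps. This is only an illustration in Python: the names `STEPS` and `enzymes_between` are hypothetical, the two ALA-forming routes of the first stage are left out, and only the reactions named in this description are encoded.

```python
# Each step is (substrate, enzyme, product), taken from the description.
# STEPS and enzymes_between are illustrative names, not an InterPro API.
STEPS = [
    ("ALA", "porphobilinogen synthase", "porphobilinogen"),
    ("porphobilinogen", "hydroxymethylbilane synthase", "preuroporphyrinogen"),
    ("preuroporphyrinogen", "uroporphyrinogen III synthase", "uroporphyrinogen III"),
    # Branch point: two alternative fates of uroporphyrinogen III.
    ("uroporphyrinogen III", "uroporphyrinogen III methyltransferase", "precorrin-2"),
    ("uroporphyrinogen III", "uroporphyrinogen III decarboxylase", "coproporphyrinogen III"),
]


def enzymes_between(start, goal, steps=STEPS):
    """Depth-first search; returns the enzymes leading from start to goal, or None."""
    if start == goal:
        return []
    for substrate, enzyme, product in steps:
        if substrate == start:
            rest = enzymes_between(product, goal, steps)
            if rest is not None:
                return [enzyme] + rest
    return None


if __name__ == "__main__":
    for enzyme in enzymes_between("ALA", "coproporphyrinogen III"):
        print(enzyme)
```

    Both branches share the first three enzymes, so the routes from ALA to precorrin-2 and to coproporphyrinogen III differ only in their last step.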

    This entry represents porphobilinogen (PBG) synthase (PBGS, or 5-aminolaevulinic acid dehydratase, or ALAD), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses a Knorr-type condensation reaction between two molecules of ALA to generate porphobilinogen, the pyrrolic building block used in later steps. The structure of the enzyme is based on a TIM barrel topology made up of eight identical subunits, where each subunit binds to a metal ion that is essential for activity, usually zinc (in yeast, mammals and certain bacteria) or magnesium (in plants and other bacteria). A lysine has been implicated in the catalytic mechanism. The lack of PBGS enzyme causes a rare porphyric disorder known as ALAD porphyria, which appears to involve conformational changes in the enzyme.

    Proteins where this domain is known:
    PF14_0381   


    PTHR11469 - G6P_Isomerase (Panther link)

    Interpro entry IPR001672 : Phosphoglucose isomerase (PGI) (Interpro link)

    Interpro description:

    Phosphoglucose isomerase (PGI) is a dimeric enzyme that catalyses the reversible isomerization of glucose-6-phosphate and fructose-6-phosphate. PGI is involved in different pathways: in most higher organisms it is involved in glycolysis; in mammals it is involved in gluconeogenesis; in plants in carbohydrate biosynthesis; in some bacteria it provides a gateway for fructose into the Entner-Doudoroff pathway. The multifunctional protein, PGI, is also known as neuroleukin (a neurotrophic factor that mediates the differentiation of neurons), autocrine motility factor (a tumour-secreted cytokine that regulates cell motility), differentiation and maturation mediator and myofibril-bound serine proteinase inhibitor, and has different roles inside and outside the cell. In the cytoplasm, it catalyses the second step in glycolysis, while outside the cell it serves as a nerve growth factor and cytokine.

    PGI from Bacillus stearothermophilus has an open twisted alpha/beta structural motif consisting of two globular domains and two protruding parts. It has been suggested that the top part of the large domain together with one of the protruding loops might participate in inducing the neurotrophic activity. The structure of rabbit muscle phosphoglucose isomerase complexed with various inhibitors shows that the enzyme is a dimer with two alpha/beta-sandwich domains in each subunit. The location of the bound D-gluconate 6-phosphate inhibitor leads to the identification of residues involved in substrate specificity. In addition, the positions of amino acid residues that are substituted in the genetic disease nonspherocytic hemolytic anemia suggest how these substitutions can result in altered catalysis or protein stability.

    Proteins where this domain is known:
    PF14_0341   


    PTHR11472 - PTHR11472 (Panther link)

    Proteins where this domain is known:
    MAL13P1.134    PF14_0081    PFI1650w   


    PTHR11472:SF1 - PTHR11472:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1650w   


    PTHR11472:SF4 - PTHR11472:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0081   


    PTHR11472:SF5 - PTHR11472:SF5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.134   


    PTHR11476 - His-tRNA_synth (Panther link)

    Interpro entry IPR004516 : Histidyl-tRNA synthetase, class IIa (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Histidyl-tRNA synthetase is an alpha2 dimer that belongs to class IIa. Every completed genome includes a histidyl-tRNA synthetase. Apparent second copies from Bacillus subtilis, Synechocystis sp. (strain PCC 6803), and Aquifex aeolicus are slightly shorter, more closely related to each other than to other hisS proteins, and not demonstrated to act as histidyl-tRNA synthetases. The regulatory protein kinase GCN2 of Saccharomyces cerevisiae (YDR283c), and related proteins from other species designated eIF-2 alpha kinase, have a domain closely related to histidyl-tRNA synthetase that may serve to detect and respond to uncharged tRNA(his), an indicator of amino acid starvation, but these regulatory proteins are not orthologous.

    Proteins where this domain is known:
    PF14_0428    PFI1645c   


    PTHR11477 - PTHR11477 (Panther link)

    Proteins where this domain is known:
    PF07_0057   


    PTHR11482 - PTHR11482 (Panther link)

    Proteins where this domain is known:
    PF10_0322   


    PTHR11482:SF4 - PTHR11482:SF4 (Panther link)

    Proteins where this domain is known:
    PF10_0322   


    PTHR11489 - Ribosomal_S2_e/a (Panther link)

    Interpro entry IPR005707 : Ribosomal protein S2, eukaryotic/archaeal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family describes the ribosomal protein of the eukaryotic cytosol and of Archaea, homologous to S2 of bacteria. It is designated typically as Sa in eukaryotes and Sa or S2 in the archaea.

    Proteins where this domain is known:
    PF10_0264   


    PTHR11489:SF7 - PTHR11489:SF7 (Panther link)

    Proteins where this domain is known:
    PF10_0264   


    PTHR11502 - Ribosomal_S6E (Panther link)

    Interpro entry IPR001377 : Ribosomal protein S6e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins have been grouped on the basis of sequence similarities. Ribosomal protein S6 is the major substrate of protein kinases in eukaryotic ribosomes and may play an important role in controlling cell growth and proliferation through the selective translation of particular classes of mRNA.

    Proteins where this domain is known:
    PF13_0228   


    PTHR11502:SF1 - PTHR11502:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0228   


    PTHR11510 - Inos-1-P_synth (Panther link)

    Interpro entry IPR002587 : Myo-inositol-1-phosphate synthase (Interpro link)

    Interpro description:

    1L-myo-Inositol-1-phosphate synthase catalyzes the conversion of D-glucose 6-phosphate to 1L-myo-inositol-1-phosphate, the first committed step in the production of all inositol-containing compounds, including phospholipids, either directly or by salvage. The enzyme exists in a cytoplasmic form in a wide range of plants, animals, and fungi. It has also been detected in several bacteria, and a chloroplast form is observed in algae and higher plants. Inositol phosphates play an important role in signal transduction.

    In Saccharomyces cerevisiae (Baker's yeast), the transcriptional regulation of the INO1 gene has been studied in detail and its expression is sensitive to the availability of phospholipid precursors as well as growth phase. The regulation of the structural gene encoding 1L-myo-inositol-1-phosphate synthase has also been analyzed at the transcriptional level in the aquatic angiosperm, Spirodela polyrrhiza (Giant duckweed) and the halophyte, Mesembryanthemum crystallinum (Common ice plant).

    Proteins where this domain is known:
    PFE0585c   


    PTHR11516 - PTHR11516 (Panther link)

    Proteins where this domain is known:
    PF11_0256    PF13_0070   


    PTHR11516:SF1 - PTHR11516:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0070   


    PTHR11516:SF5 - PYRUVATE DEHYDROGENASE E1 COMPONENT, ALPHA SUBUNIT (Panther link)

    Proteins where this domain is known:
    PF11_0256   


    PTHR11517 - Ribosomal_L37ae (Panther link)

    Interpro entry IPR002674 : Ribosomal protein L37ae (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This ribosomal protein is found in archaebacteria and eukaryotes. Ribosomal protein L37 has a single zinc finger-like motif of the C2-C2 type.

    Proteins where this domain is known:
    PFB0455w   


    PTHR11517:SF1 - PTHR11517:SF1 (Panther link)

    Proteins where this domain is known:
    PFB0455w   


    PTHR11524 - PTHR11524 (Panther link)

    Proteins where this domain is known:
    PFC0300c   


    PTHR11525 - PTHR11525 (Panther link)

    Proteins where this domain is known:
    PF11_0295   


    PTHR11527 - PTHR11527 (Panther link)

    Proteins where this domain is known:
    MAL8P1.78    PF13_0021    PFL0550w   


    PTHR11527:SF15 - PTHR11527:SF15 (Panther link)

    Proteins where this domain is known:
    MAL8P1.78    PF13_0021    PFL0550w   


    PTHR11528 - Hsp90 (Panther link)

    Interpro entry IPR001404 : Heat shock protein Hsp90 (Interpro link)

    Interpro description:

    Prokaryotes and eukaryotes respond to heat shock and other forms of environmental stress by inducing synthesis of heat-shock proteins (hsp). The 90 kDa heat shock protein, Hsp90, is one of the most abundant proteins in eukaryotic cells, comprising 1–2% of cellular proteins under non-stress conditions. Its contribution to various cellular processes including signal transduction, protein folding, protein degradation and morphological evolution has been extensively studied. The full functional activity of Hsp90 is gained in concert with other co-chaperones, playing an important role in the folding of newly synthesised proteins and stabilisation and refolding of denatured proteins after stress. Apart from its co-chaperones, Hsp90 binds to an array of client proteins, where the co-chaperone requirement varies and depends on the actual client.

    The sequences of hsp90s show a distinctive domain structure, with a highly-conserved N-terminal domain separated from a conserved, acidic C-terminal domain by a highly-acidic, flexible linker region.

    Proteins where this domain is known:
    PF07_0029    PF11_0188    PF14_0417    PFL1070c   


    PTHR11528:SF18 - PTHR11528:SF18 (Panther link)

    Proteins where this domain is known:
    PF14_0417   


    PTHR11528:SF19 - PTHR11528:SF19 (Panther link)

    Proteins where this domain is known:
    PFL1070c   


    PTHR11528:SF24 - PTHR11528:SF24 (Panther link)

    Proteins where this domain is known:
    PF11_0188   


    PTHR11533 - Peptidase_M1 (Panther link)

    Interpro entry IPR001930 : Peptidase M1, membrane alanine aminopeptidase (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
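
    A minimal sketch of how the basic HEXXH motif could be located in a protein sequence is given below in Python. The function name `find_hexxh` and the toy sequence are assumptions made for illustration, and the stricter 'abXHEbbHbc' form is not implemented because it would require choosing explicit residue sets for 'b' and 'c'.

```python
import re

# H, E, any two residues, H. The lookahead lets overlapping hits be reported.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence):
    """Return (1-based position, matched residues) for every HEXXH occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(seq)]


if __name__ == "__main__":
    toy = "MKVAHEIGHNLG"  # made-up sequence, not a real P. falciparum protein
    print(find_hexxh(toy))  # [(5, 'HEIGH')]
```

    Because the motif is relatively common, a hit on its own does not establish that a protein is a metalloprotease; it only flags a candidate metal-binding site.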

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to the MEROPS peptidase family M1 (clan MA(E)), the type example being aminopeptidase N from Homo sapiens (Human). The protein fold of the peptidase domain for members of this family resembles that of thermolysin, the type example for clan MA.

    Membrane alanine aminopeptidase is part of the HEXXH+E group; it consists entirely of aminopeptidases, spread across a wide variety of species. Functional studies show that CD13/APN catalyzes the removal of single amino acids from the amino terminus of small peptides and probably plays a role in their final digestion; one family member (leukotriene-A4 hydrolase) is known to hydrolyse the epoxide leukotriene-A4 to form an inflammatory mediator. This hydrolase has been shown to have aminopeptidase activity, and the zinc ligands of the M1 family were identified by site-directed mutagenesis on this enzyme. CD13 participates in trimming peptides bound to MHC class II molecules and cleaves MIP-1 chemokine, which alters target cell specificity from basophils to eosinophils. CD13 acts as a receptor for specific strains of RNA viruses (coronaviruses) which cause a relatively large percentage of upper respiratory tract infections.

    CD molecules are leucocyte antigens on cell surfaces. CD antigen nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).

    Proteins where this domain is known:
    MAL13P1.56    PF14_0692   


    PTHR11533:SF35 - PTHR11533:SF35 (Panther link)

    Proteins where this domain is known:
    PF14_0692   


    PTHR11533:SF8 - Pept_M1_pepN (Panther link)

    Interpro entry IPR012779 : Peptidase M1, alanyl aminopeptidase (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    The M1 family of zinc metallopeptidases contains a number of distinct, well-separated clades of proteins with aminopeptidase activity. Several are designated aminopeptidase N after the Escherichia coli enzyme, suggesting a similar activity profile.

    This family of zinc metallopeptidases belongs to MEROPS peptidase family M1 (aminopeptidase N, clan MA); the majority are identified as alanyl aminopeptidases (proteobacteria) that are closely related to E. coli PepN and presumed to have a similar (not identical) function. Nearly all are found in proteobacteria, but members are found also in cyanobacteria, plants, and apicomplexan parasites. This family differs greatly in sequence from the family of aminopeptidases typified by Streptomyces lividans PepN and from the membrane bound aminopeptidase N family in animals.

    Proteins where this domain is known:
    MAL13P1.56   


    PTHR11537 - PTHR11537 (Panther link)

    Proteins where this domain is known:
    PFL1315w   


    PTHR11537:SF14 - PTHR11537:SF14 (Panther link)

    Proteins where this domain is known:
    PFL1315w   


    PTHR11538 - tRNA-synt_2d (Panther link)

    Interpro entry IPR002319 : Phenylalanyl-tRNA synthetase, class IIc, conserved region (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Phenylalanyl-tRNA synthetase is an alpha2/beta2 tetramer composed of 2 subunits that belongs to class IIc. In eubacteria, a small subunit (pheS gene) can be designated as beta (E. coli) or alpha subunit (nomenclature adopted in InterPro). Reciprocally the large subunit (pheT gene) can be designated as alpha (E. coli) or beta. In all other kingdoms the two subunits have equivalent length in eukaryota, and can be identified by specific signatures. The enzyme from Thermus thermophilus has an alpha 2 beta 2 type quaternary structure and is one of the most complicated members of the synthetase family. Identification of phenylalanyl-tRNA synthetase as a member of class II aaRSs was based only on sequence alignment of the small alpha-subunit with other synthetases.

    Proteins where this domain is known:
    PFA0480w    PFF0180w    PFL1540c   


    PTHR11538:SF15 - PTHR11538:SF15 (Panther link)

    Proteins where this domain is known:
    PFA0480w    PFL1540c   


    PTHR11540 - MALATE AND LACTATE DEHYDROGENASE (Panther link)

    Proteins where this domain is known:
    PF13_0141    PF13_0144    PFF0895w   


    PTHR11540:SF4 - MALATE DEHYDROGENASE (Panther link)

    Proteins where this domain is known:
    PF13_0141    PF13_0144    PFF0895w   


    PTHR11544 - PTHR11544 (Panther link)

    Proteins where this domain is known:
    PFA0470c   


    PTHR11545 - Ribosomal_L13 (Panther link)

    Interpro entry IPR005822 : Ribosomal protein L13 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.

    Proteins where this domain is known:
    PF10_0043    PFB0645c   


    PTHR11545:SF2 - Ribosom_L13_bac (Panther link)

    Interpro entry IPR005823 : Ribosomal protein L13, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit. This entry represents ribosomal protein L13 from bacteria, mitochondria and chloroplasts.

    Proteins where this domain is known:
    PFB0645c   


    PTHR11545:SF3 - Ribosomal_L13e/a (Panther link)

    Interpro entry IPR005755 : Ribosomal protein L13, eukaryotic/archaeal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit. This model represents ribosomal protein L13 from the Archaea and from the eukaryotic cytosol.

    Proteins where this domain is known:
    PF10_0043   


    PTHR11546 - PTHR11546 (Panther link)

    Proteins where this domain is known:
    PFF0320c   


    PTHR11546:SF3 - PTHR11546:SF3 (Panther link)

    Proteins where this domain is known:
    PFF0320c   


    PTHR11549 - DIHYDROFOLATE REDUCTASE (Panther link)

    Proteins where this domain is known:
    PFD0830w   


    PTHR11549:SF2 - Thymidylat_synth_C (Panther link)

    Interpro entry IPR000398 : Thymidylate synthase, C-terminal (Interpro link)

    Interpro description:
    Thymidylate synthase catalyzes the reductive methylation of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to dihydrofolate:
     5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP 
    This provides the sole de novo pathway for production of dTMP and is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.

    Proteins where this domain is known:
    PFD0830w   


    PTHR11550 - PyrG_synth (Panther link)

    Interpro entry IPR004468 : CTP synthase (Interpro link)

    Interpro description:

    CTP synthase is involved in pyrimidine ribonucleotide/ribonucleoside metabolism, catalysing the synthesis of CTP from UTP by amination of the pyrimidine ring at the 4-position. The enzyme exists as a dimer of identical chains that aggregates as a tetramer. This gene has been found roughly 500 bp upstream of enolase in both beta (Nitrosomonas europaea) and gamma (Escherichia coli) subdivisions of the Proteobacteria.

    Proteins where this domain is known:
    PF14_0100   


    PTHR11557 - Porphobil_deam (Panther link)

    Interpro entry IPR000860 : Tetrapyrrole biosynthesis, hydroxymethylbilane synthase (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.

    This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesized in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP).
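
    The ordered assembly described above can be sketched as a toy model. This is only an illustration of the stated stoichiometry (four PBG units in, one tetrapyrrole and four ammonia out); the function and variable names are hypothetical and the code says nothing about the real chemistry or kinetics.

```python
# Toy model of the ordered assembly described in the text.
# All names here (RING_ORDER, assemble_tetrapyrrole) are hypothetical.
RING_ORDER = ["A", "B", "C", "D"]  # order of pyrrole addition given in the description


def assemble_tetrapyrrole(pbg_units):
    """Add PBG units one at a time to a cofactor-bound chain.

    Returns the released tetrapyrrole (list of rings) and the number of
    ammonia molecules released, one per PBG unit added.
    """
    if pbg_units != len(RING_ORDER):
        raise ValueError("exactly four PBG units are needed for one tetrapyrrole")
    chain = ["cofactor"]       # dipyrromethane cofactor bound to the enzyme
    ammonia_released = 0
    for ring in RING_ORDER:
        chain.append(ring)     # ring A attaches first, then B, C and D extend the chain
        ammonia_released += 1  # each addition releases one molecule of ammonia
    product = chain[1:]        # the link to the cofactor is broken once all rings are added
    return product, ammonia_released


product, nh3 = assemble_tetrapyrrole(4)
print(product, nh3)
```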

    Proteins where this domain is known:
    PFL0480w   


    PTHR11558 - Sprmine_synthase (Panther link)

    Interpro entry IPR001045 : Spermine synthase (Interpro link)

    Interpro description:
    Synonym(s): Spermidine aminopropyltransferase

    A group of polyamine biosynthetic enzymes involved in the fifth (last) step in the biosynthesis of spermidine from arginine and methionine, which includes spermidine synthase, spermine synthase and putrescine N-methyltransferase.

    The Thermotoga maritima spermidine synthase monomer consists of two domains: an N-terminal domain composed of six beta-strands, and a Rossmann-like C-terminal domain. The larger C-terminal catalytic core domain consists of a seven-stranded beta-sheet flanked by nine alpha helices. This domain resembles a topology observed in a number of nucleotide and dinucleotide-binding enzymes, and in S-adenosyl-L-methionine (AdoMet)-dependent methyltransferases (MTases).

    Proteins where this domain is known:
    PF11_0301   


    PTHR11558:SF11 - PTHR11558:SF11 (Panther link)

    Proteins where this domain is known:
    PF11_0301   


    PTHR11562 - Cation_efflux (Panther link)

    Interpro entry IPR002524 : Cation efflux protein (Interpro link)

    Interpro description:

    Members of this family are integral membrane proteins that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are considered to be efflux pumps that remove these ions from cells; however, others are implicated in ion uptake. The family has six predicted transmembrane domains. Members of the family are variable in length because of variably sized inserts, often containing low-complexity sequence.

    Proteins where this domain is known:
    PF07_0065   


    PTHR11564 - PTHR11564 (Panther link)

    Proteins where this domain is known:
    PF13_0350    PF14_0477   


    PTHR11564:SF5 - PTHR11564:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0477   


    PTHR11564:SF8 - PTHR11564:SF8 (Panther link)

    Proteins where this domain is known:
    PF13_0350   


    PTHR11566 - DYNAMIN (Panther link)

    Proteins where this domain is known:
    PF10_0368    PF11_0465   


    PTHR11566:SF12 - DYNAMIN (Panther link)

    Proteins where this domain is known:
    PF10_0368    PF11_0465   


    PTHR11567 - PTHR11567 (Panther link)

    Proteins where this domain is known:
    MAL8P1.136   


    PTHR11567:SF25 - PTHR11567:SF25 (Panther link)

    Proteins where this domain is known:
    MAL8P1.136   


    PTHR11571 - GLUTATHIONE S-TRANSFERASE (Panther link)

    Proteins where this domain is known:
    PF14_0187   


    PTHR11571:SF2 - GLUTATHIONE S-TRANSFERASE CLASS PI (Panther link)

    Proteins where this domain is known:
    PF14_0187   


    PTHR11573 - Ribncl_red_lg_C (Panther link)

    Interpro entry IPR000788 : Ribonucleotide reductase large subunit, C-terminal (Interpro link)

    Interpro description:

    Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides. It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.

    Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain.

    The reduction of ribonucleotides to deoxyribonucleotides involves the transfer of free radicals; the function of each metallocofactor is to generate an active site thiyl radical. This thiyl radical then initiates the nucleotide reduction process by hydrogen atom abstraction from the ribonucleotide. The radical-based reaction involves five cysteines: two of these are located at adjacent anti-parallel strands in a new type of ten-stranded alpha/beta-barrel; two others reside at the carboxyl end in a flexible arm; and the fifth, in a loop in the centre of the barrel, is positioned to initiate the radical reaction. There are several regions of similarity in the sequence of the large chain of prokaryotes, eukaryotes and viruses spread across 3 domains: an N-terminal domain common to the mammalian and bacterial enzymes; a C-terminal domain common to the mammalian and viral ribonucleotide reductases; and a central domain common to all three.
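
    The three RNR classes named in this description can be restated as a small lookup table. The sketch below only encodes what the description says; the table layout and the names RNR_CLASSES and classes_found_in are hypothetical and not part of InterPro.

```python
# Illustrative lookup table restating the description.
# RNR_CLASSES and classes_found_in are hypothetical names.
RNR_CLASSES = {
    "I": {
        "found_in": {"eukaryotes", "bacteria", "bacteriophage", "viruses"},
        "radical_source": "diiron-tyrosyl radical",
    },
    "II": {
        "found_in": {"bacteria", "bacteriophage", "algae", "archaea"},
        "radical_source": "coenzyme B12 (adenosylcobalamin, AdoCbl)",
    },
    "III": {
        "found_in": {"anaerobic bacteria", "bacteriophage"},
        "radical_source": "FeS cluster and S-adenosylmethionine (glycyl radical)",
    },
}


def classes_found_in(organism_group):
    """Return the RNR classes whose listed hosts include organism_group."""
    return sorted(
        name for name, info in RNR_CLASSES.items()
        if organism_group in info["found_in"]
    )


print(classes_found_in("bacteriophage"))
print(classes_found_in("archaea"))
```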

    Proteins where this domain is known:
    PF14_0352   


    PTHR11573:SF6 - PTHR11573:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0352   


    PTHR11579 - PCMT (Panther link)

    Interpro entry IPR000682 : Protein-L-isoaspartate(D-aspartate) O-methyltransferase (Interpro link)

    Interpro description:

    Protein-L-isoaspartate(D-aspartate) O-methyltransferase (PCMT) (which is also known as L-isoaspartyl protein carboxyl methyltransferase) is an enzyme that catalyses the transfer of a methyl group from S-adenosylmethionine to the free carboxyl groups of D-aspartyl or L-isoaspartyl residues in a variety of peptides and proteins. The enzyme does not act on normal L-aspartyl residues. L-isoaspartyl and D-aspartyl residues are the products of the spontaneous deamidation and/or isomerisation of normal L-aspartyl and L-asparaginyl residues in proteins. PCMT plays a role in the repair and/or degradation of these damaged proteins; the enzymatic methyl esterification of the abnormal residues can lead to their conversion to normal L-aspartyl residues. The SAM domain is present in most of these proteins.

    Proteins where this domain is known:
    PF14_0309   


    PTHR11581 - Ribosomal_S4E (Panther link)

    Interpro entry IPR000876 : Ribosomal protein S4e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes yeast S7 (YS6); archaeal S4e; and mammalian and plant cytoplasmic S4. Two highly similar isoforms of mammalian S4 exist, one coded by a gene on chromosome Y, and the other on chromosome X. These proteins have 233 to 264 amino acids.

    Proteins where this domain is known:
    PF11_0065   


    PTHR11583 - PTHR11583 (Panther link)

    Proteins where this domain is known:
    PFI1670c   


    PTHR11584 - SERINE/THREONINE PROTEIN KINASE (Panther link)

    Proteins where this domain is known:
    MAL7P1.18   


    PTHR11588 - Tubulin (Panther link)

    Interpro entry IPR000217 : Tubulin (Interpro link)

    Interpro description:

    Microtubules are polymers of tubulin, a dimer of two 55-kDa subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end.

    For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.

    Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly.

    Proteins where this domain is known:
    PF08_0125    PF10_0084    PF14_0725    PFD1050w    PFI0180w    PFI1635w   


    PTHR11588:SF10 - Alpha_tubulin (Panther link)

    Interpro entry IPR002452 : Alpha tubulin (Interpro link)

    Interpro description:
    Microtubules are polymers of tubulin, a dimer of two 55 kD subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.

    Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. British type familial amyloidosis is an autosomal dominant disease characterised by progressive dementia, spastic paralysis and ataxia. Amyloid deposits from the brain tissue of an individual who died with this disease have been characterised. Trypsin digestion and subsequent N-terminal sequence analysis yielded a number of short sequences, all of which are tryptic fragments of the C-termini of human alpha- and beta-tubulin. Consistent with the definition of amyloid, synthetic peptides based on the sequences of these fragments formed fibrils in vitro, suggesting that the C-termini of both alpha- and beta-tubulin are closely associated with the amyloid deposits of this type of amyloidosis. Several alpha-tubulin isotypes have been described, each distinguished by the presence of unique amino acid substitutions within the coding region. Most of these isotype-specific amino acids are clustered at the C-terminus. Patterns of developmental expression of the various alpha-tubulin isotypes have been studied. Results suggest that individual tubulin isotypes confer functional specificity on different kinds of microtubules.

    Proteins where this domain is known:
    PFD1050w    PFI0180w   


    PTHR11588:SF13 - PTHR11588:SF13 (Panther link)

    Proteins where this domain is known:
    PF14_0725   


    PTHR11588:SF4 - Delta_tubulin (Panther link)

    Interpro entry IPR002967 : Delta tubulin (Interpro link)

    Interpro description:
    Microtubules are polymers of tubulin, a dimer of two 55 kD subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site. Most species, excepting simple eukaryotes, express a variety of closely-related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. Gamma tubulin is found at microtubule-organising centres, such as the spindle poles or the centrosome, suggesting that it is involved in minus-end nucleation of microtubule assembly. More recently, a new delta-type tubulin has been identified in Chlamydomonas reinhardtii and Mus musculus (Mouse), and is likely to be found in a number of other species.

    Proteins where this domain is known:
    PFI1635w   


    PTHR11588:SF7 - PTHR11588:SF7 (Panther link)

    Proteins where this domain is known:
    PF08_0125   


    PTHR11588:SF9 - Beta_tubulin (Panther link)

    Interpro entry IPR002453 : Beta tubulin (Interpro link)

    Interpro description:
    Microtubules are polymers of tubulin, a dimer of two 55 kD subunits, designated alpha and beta. Within the microtubule lattice, alpha-beta heterodimers associate in a head-to-tail fashion, giving rise to microtubule polarity. Fluorescent labelling studies have suggested that tubulin is oriented in microtubules with beta-tubulin toward the plus end. For maximal rate and extent of polymerisation into microtubules, tubulin requires GTP. Two molecules of GTP are bound at different sites, termed N and E. At the E (Exchangeable) site, GTP is hydrolysed during incorporation into the microtubule. Close to the E site is an invariant region rich in glycine residues, which is found in both chains and is thought to control access of the nucleotide to its binding site.

    Most species, excepting simple eukaryotes, express a variety of closely related alpha- and beta-isotypes. A third family member, gamma tubulin, has also been identified in a number of species. British type familial amyloidosis is an autosomal dominant disease characterised by progressive dementia, spastic paralysis and ataxia. Amyloid deposits from the brain tissue of an individual who died with this disease have been characterised. Trypsin digestion and subsequent N-terminal sequence analysis yielded a number of short sequences, all of which are tryptic fragments of the C-termini of human alpha- and beta-tubulin. Consistent with the definition of amyloid, synthetic peptides based on the sequences of these fragments formed fibrils in vitro, suggesting that the C-termini of both alpha- and beta-tubulin are closely associated with the amyloid deposits of this type of amyloidosis. The amino acid sequences encoded by beta tubulin genes have revealed a high level of overall similarity, but significant divergence between their C-termini. The pattern of expression of the beta-tubulin genes has been studied in several different human cell lines and has revealed varying levels of and differential expression in different cell lines. It appears that distinct human beta-tubulin isotypes are encoded by genes whose exon size and number has been conserved evolutionarily, but whose pattern of expression may be regulated either co-ordinately or uniquely.

    Proteins where this domain is known:
    PF10_0084   


    PTHR11592 - Glut_peroxidase (Panther link)

    Interpro entry IPR000889 : Glutathione peroxidase (Interpro link)

    Interpro description:

    Glutathione peroxidase (GSHPx) is an enzyme that catalyses the reduction of hydroperoxides by glutathione. Its main function is to protect against the damaging effect of endogenously formed hydroperoxides. In higher vertebrates, several forms of GSHPx are known, including a ubiquitous cytosolic form (GSHPx-1), a gastrointestinal cytosolic form (GSHPx-GI), a plasma secreted form (GSHPx-P), and an epididymal secretory form (GSHPx-EP). In addition to these characterised forms, the sequence of a protein of unknown function has been shown to be evolutionarily related to those of GSHPxs.

    In filarial nematode parasites, the major soluble cuticular protein (gp29) is a secreted GSHPx, which may provide a mechanism of resistance to the immune reaction of the mammalian host by neutralising the products of the oxidative burst of leukocytes. The Escherichia coli protein btuE, a periplasmic protein involved in vitamin B12 transport, is evolutionarily related to GSHPxs, although the significance of this relationship is unclear. The structure of bovine seleno-glutathione peroxidase has been determined. The protein belongs to the alpha-beta class, with a 3-layer (aba) sandwich architecture. The catalytic site of GSHPx contains a conserved residue which is either a cysteine or, in many eukaryotic GSHPx, a selenocysteine.

    Proteins where this domain is known:
    PFL0595c   


    PTHR11593 - Ribosomal_L22/17 (Panther link)

    Interpro entry IPR005721 : Ribosomal protein L22/L17, eukaryotic/archaeal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family describes the ribosomal protein of the eukaryotic cytosol and of the Archaea, variously designated as L17, L22, and L23.

    Proteins where this domain is known:
    PF13_0268   


    PTHR11594 - Ribosomal_S27E (Panther link)

    Interpro entry IPR000592 : Ribosomal protein S27e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaeal ribosomal proteins can be grouped on the basis of sequence similarities. One of these families includes mammalian, yeast, Chlamydomonas reinhardtii and Entamoeba histolytica S27, and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0250. These proteins have from 62 to 87 amino acids. They contain, in their central section, a putative zinc-finger region of the type C-x(2)-C-x(14)-C-x(2)-C.
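
    The zinc-finger pattern quoted above is written in PROSITE-style notation, where x(n) stands for a run of n arbitrary residues. As a minimal sketch, assuming only literal residues and fixed-length x(n) runs need to be handled, it can be translated into a regular expression and searched for in a protein sequence. The function names and the test sequence are hypothetical and are not part of the Interpro entry.

```python
import re

# PROSITE-style pattern quoted in the Interpro description of S27e.
S27E_ZINC_FINGER = "C-x(2)-C-x(14)-C-x(2)-C"


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Only two element types are supported (an assumption of this sketch):
    a single upper-case residue letter, and a fixed-length run "x(n)".
    """
    parts = []
    for element in pattern.split("-"):
        wildcard = re.fullmatch(r"x\((\d+)\)", element)
        if wildcard:
            # x(n) -> exactly n arbitrary characters
            parts.append(".{%s}" % wildcard.group(1))
        elif re.fullmatch(r"[A-Z]", element):
            parts.append(element)
        else:
            raise ValueError(f"unsupported pattern element: {element!r}")
    return "".join(parts)


def find_motif(sequence: str, pattern: str = S27E_ZINC_FINGER):
    """Return (start, matched_text) for the first hit, or None if absent."""
    match = re.search(prosite_to_regex(pattern), sequence)
    return (match.start(), match.group(0)) if match else None


if __name__ == "__main__":
    # Synthetic sequence built to contain the motif; not a real S27e protein.
    demo = "MK" + "C" + "AA" + "C" + "G" * 14 + "C" + "AA" + "C" + "K"
    print(prosite_to_regex(S27E_ZINC_FINGER))
    print(find_motif(demo))
```

    A full PROSITE parser would also need ranges such as x(2,4), residue classes in square brackets and exclusions in curly brackets; these are deliberately left out here.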

    Proteins where this domain is known:
    PF13_0045   


    PTHR11595 - ELONGATION FACTOR 1-BETA (Panther link)

    Proteins where this domain is known:
    PFC0870w    PFI0645w   


    PTHR11595:SF1 - gb def: Translation elongation factor 1 beta-related (Fragment) (Panther link)

    Proteins where this domain is known:
    PFI0645w   


    PTHR11599 - PTHR11599 (Panther link)

    Proteins where this domain is known:
    MAL13P1.270    MAL8P1.128    MAL8P1.142    PF07_0112    PF10_0111    PF13_0156    PF13_0282    PF14_0676    PF14_0716    PFA0400c    PFC0745c    PFE0915c    PFF0420c    PFI1545c   


    PTHR11599:SF1 - PTHR11599:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0156   


    PTHR11599:SF10 - PTHR11599:SF10 (Panther link)

    Proteins where this domain is known:
    PFC0745c   


    PTHR11599:SF11 - PTHR11599:SF11 (Panther link)

    Proteins where this domain is known:
    MAL8P1.128   


    PTHR11599:SF12 - PTHR11599:SF12 (Panther link)

    Proteins where this domain is known:
    PF14_0716   


    PTHR11599:SF13 - PTHR11599:SF13 (Panther link)

    Proteins where this domain is known:
    PF13_0282   


    PTHR11599:SF14 - PTHR11599:SF14 (Panther link)

    Proteins where this domain is known:
    PF07_0112   


    PTHR11599:SF15 - PTHR11599:SF15 (Panther link)

    Proteins where this domain is known:
    MAL13P1.270   


    PTHR11599:SF16 - PTHR11599:SF16 (Panther link)

    Proteins where this domain is known:
    PFF0420c   


    PTHR11599:SF3 - PTHR11599:SF3 (Panther link)

    Proteins where this domain is known:
    PF10_0111   


    PTHR11599:SF4 - PTHR11599:SF4 (Panther link)

    Proteins where this domain is known:
    PFI1545c   


    PTHR11599:SF5 - PROTEASOME SUBUNIT BETA TYPE 4 (Panther link)

    Proteins where this domain is known:
    MAL8P1.142   


    PTHR11599:SF6 - PTHR11599:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0676   


    PTHR11599:SF7 - PTHR11599:SF7 (Panther link)

    Proteins where this domain is known:
    PFA0400c   


    PTHR11599:SF8 - PTHR11599:SF8 (Panther link)

    Proteins where this domain is known:
    PFE0915c   


    PTHR11600 - PTHR11600 (Panther link)

    Proteins where this domain is known:
    PF11_0310    PFB0210c    PFI0955w   


    PTHR11600:SF73 - gb def: Putative sugar transporter (Panther link)

    Proteins where this domain is known:
    PFB0210c   


    PTHR11601 - PTHR11601 (Panther link)

    Proteins where this domain is known:
    MAL7P1.150    PF07_0068   


    PTHR11606 - PTHR11606 (Panther link)

    Proteins where this domain is known:
    PF08_0132    PF14_0164    PF14_0286   


    PTHR11606:SF1 - PTHR11606:SF1 (Panther link)

    Proteins where this domain is known:
    PF08_0132   


    PTHR11606:SF2 - GLFV_DH (Panther link)

    Interpro entry IPR006095 : Glutamate/phenylalanine/leucine/valine dehydrogenase (Interpro link)

    Interpro description:

    Glutamate, leucine, phenylalanine and valine dehydrogenases are structurally and functionally related. They contain a Gly-rich region containing a conserved Lys residue, which has been implicated in the catalytic activity, in each case a reversible oxidative deamination reaction.

    Glutamate dehydrogenases (GluDH) are enzymes that catalyse the NAD- and/or NADP-dependent reversible deamination of L-glutamate into alpha-ketoglutarate. GluDH isozymes are generally involved with either ammonia assimilation or glutamate catabolism. Two separate enzymes are present in yeasts: the NADP-dependent enzyme, which catalyses the amination of alpha-ketoglutarate to L-glutamate; and the NAD-dependent enzyme, which catalyses the reverse reaction - this form links the L-amino acids with the Krebs cycle, which provides a major pathway for metabolic interconversion of alpha-amino acids and alpha-keto acids.

    Leucine dehydrogenase (LeuDH) is an NAD-dependent enzyme that catalyses the reversible deamination of leucine and several other aliphatic amino acids to their keto analogues. Each subunit of this octameric enzyme from Bacillus sphaericus contains 364 amino acids and folds into two domains, separated by a deep cleft. The nicotinamide ring of the NAD+ cofactor binds deep in this cleft, which is thought to close during the hydride transfer step of the catalytic cycle.

    Phenylalanine dehydrogenase (PheDH) is an NAD-dependent enzyme that catalyses the reversible deamination of L-phenylalanine into phenyl-pyruvate.

    Valine dehydrogenase (ValDH) is an NADP-dependent enzyme that catalyses the reversible deamination of L-valine into 3-methyl-2-oxobutanoate.

    Proteins where this domain is known:
    PF14_0164    PF14_0286   


    PTHR11614 - PTHR11614 (Panther link)

    Proteins where this domain is known:
    MAL7P1.178    PF07_0005    PF07_0040    PF10_0018    PF10_0020    PF10_0379    PF14_0017    PF14_0737    PF14_0738    PFA0120c    PFI1775w    PFI1800w    PFL2530w   


    PTHR11614:SF20 - PTHR11614:SF20 (Panther link)

    Proteins where this domain is known:
    PF10_0018    PFI1775w   


    PTHR11615 - PTHR11615 (Panther link)

    Proteins where this domain is known:
    PFF0685c   


    PTHR11615:SF31 - PTHR11615:SF31 (Panther link)

    Proteins where this domain is known:
    PFF0685c   


    PTHR11616 - Na/ntran_symport (Panther link)

    Interpro entry IPR000175 : Sodium:neurotransmitter symporter (Interpro link)

    Interpro description:

    Neurotransmitter transport systems are integral to the release, re-uptake and recycling of neurotransmitters at synapses. High affinity transport proteins found in the plasma membrane of presynaptic nerve terminals and glial cells are responsible for the removal of released transmitters from the extracellular space, thereby terminating their actions. Plasma membrane neurotransmitter transporters fall into two structurally and mechanistically distinct families. The majority of the transporters constitute an extensive family of homologous proteins that derive energy from the co-transport of Na+ and Cl-, in order to transport neurotransmitter molecules into the cell against their concentration gradient. The family has a common structure of 12 presumed transmembrane helices and includes carriers for gamma-aminobutyric acid (GABA), noradrenaline/adrenaline, dopamine, serotonin, proline, glycine, choline, betaine and taurine. They are structurally distinct from the second more-restricted family of plasma membrane transporters, which are responsible for excitatory amino acid transport. The latter couple glutamate and aspartate uptake to the cotransport of Na+ and the counter-transport of K+, with no apparent dependence on Cl-. In addition, both of these transporter families are distinct from the vesicular neurotransmitter transporters.

    Sequence analysis of the Na+/Cl- neurotransmitter superfamily reveals that it can be divided into four subfamilies, these being transporters for monoamines, the amino acids proline and glycine, GABA, and a group of orphan transporters.

    Proteins where this domain is known:
    PFB0435c    PFE0775c   


    PTHR11616:SF1 - PTHR11616:SF1 (Panther link)

    Proteins where this domain is known:
    PFE0775c   


    PTHR11616:SF4 - PTHR11616:SF4 (Panther link)

    Proteins where this domain is known:
    PFB0435c   


    PTHR11618 - TFIIB_euk_relate (Panther link)

    Interpro entry IPR000812 : Transcription factor TFIIB related (Interpro link)

    Interpro description:

    In eukaryotes, transcription initiation by polymerase II is modulated by both general and specific transcription factors. The general factors (which include TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIG and TFIIH) operate through common promoter elements, such as the TATA box. Transcription factor IIB (TFIIB) is of central importance in transcription of class II genes. It associates with TFIID-TFIIA bound to DNA (the DA complex) to form a ternary TFIID-IIA-IIB (DAB) complex, which is recognised by RNA polymerase II. TFIIB comprises ~315-340 residues and contains an imperfect C-terminal repeat of a 75-residue domain that may contribute to the symmetry of the folded protein.

    Proteins where this domain is known:
    PF14_0469    PFA0525w   


    PTHR11618:SF4 - PTHR11618:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0469   


    PTHR11620 - PTHR11620 (Panther link)

    Proteins where this domain is known:
    PF13_0132   


    PTHR11621 - UBQ-conjugat_E2 (Panther link)

    Interpro entry IPR000608 : Ubiquitin-conjugating enzyme, E2 (Interpro link)

    Interpro description:

    The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.

    There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
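
    The four-class scheme described above depends only on which terminal extensions accompany the catalytic core (UBC) domain. A minimal sketch of that mapping follows; the function name and its boolean inputs are hypothetical, and deciding whether a given protein actually carries an extension is outside its scope.

```python
def e2_class(has_n_extension: bool, has_c_extension: bool) -> str:
    """Return the E2 class (I-IV) implied by the terminal extensions.

    Class I:   catalytic core (UBC) only
    Class II:  UBC plus a C-terminal extension
    Class III: UBC plus an N-terminal extension
    Class IV:  UBC plus both N- and C-terminal extensions
    """
    if has_n_extension and has_c_extension:
        return "IV"
    if has_n_extension:
        return "III"
    if has_c_extension:
        return "II"
    return "I"


if __name__ == "__main__":
    # Enumerate all four combinations of extensions.
    for n_ext in (False, True):
        for c_ext in (False, True):
            print(n_ext, c_ext, e2_class(n_ext, c_ext))
```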

    Proteins where this domain is known:
    MAL13P1.227    PF08_0085    PF10_0330    PF13_0301    PF14_0128    PFC0255c    PFC0855w    PFE1350c    PFF0305c    PFI0740c    PFI1030c    PFL0190w    PFL2100w    PFL2175w   


    PTHR11621:SF16 - PTHR11621:SF16 (Panther link)

    Proteins where this domain is known:
    MAL13P1.227    PF14_0128   


    PTHR11621:SF17 - Ubc12 (Panther link)

    Interpro entry IPR015580 : RUB1 conjugating enzyme Ubc12 (Interpro link)

    Interpro description:

    Ubiquitin-conjugating enzymes (UBC or E2 enzymes) catalyze the covalent attachment of ubiquitin to target proteins. An activated ubiquitin moiety is transferred from a ubiquitin-activating enzyme (E1) to E2, which later ligates ubiquitin directly to substrate proteins with or without the assistance of 'N-end' recognizing proteins (E3). A cysteine residue is required for ubiquitin-thiolester formation. There is a single conserved cysteine in UBCs and the region around that residue is conserved in the sequence of known UBC isozymes. There are, however, exceptions: TSG101 is one of several UBC homologues that lack this active site cysteine. In most species there are many forms of UBC (at least 9 in yeast) which are implicated in diverse cellular functions.

    The specificity of ubiquitination is conferred primarily by interactions of substrates with specific ubiquitin protein ligases (E3s) in association with ubiquitin conjugating enzymes (E2s).

    Ubc12 is an E2 conjugating enzyme for RUB1, a ubiquitin-like protein displaying 53% amino acid identity to ubiquitin. It is evolutionarily conserved across species ranging from Arabidopsis thaliana (Mouse-ear cress) to Homo sapiens (Human).

    Proteins where this domain is known:
    PFL2175w   


    PTHR11621:SF19 - Ubq_conj_E2 (Panther link)

    Interpro entry IPR015581 : Ubiquitin-conjugating enzyme (Interpro link)

    Interpro description:

    Ubiquitin-conjugating enzymes (UBC or E2 enzymes) catalyze the covalent attachment of ubiquitin to target proteins. An activated ubiquitin moiety is transferred from a ubiquitin-activating enzyme (E1) to E2, which later ligates ubiquitin directly to substrate proteins with or without the assistance of 'N-end' recognizing proteins (E3). A cysteine residue is required for ubiquitin-thiolester formation. There is a single conserved cysteine in UBCs and the region around that residue is conserved in the sequence of known UBC isozymes. There are, however, exceptions: TSG101 is one of several UBC homologues that lack this active site cysteine. In most species there are many forms of UBC (at least 9 in yeast) which are implicated in diverse cellular functions.

    The specificity of ubiquitination is conferred primarily by interactions of substrates with specific ubiquitin protein ligases (E3s) in association with ubiquitin conjugating enzymes (E2s).

    This entry includes 24 kDa E2 ubiquitin-conjugating enzymes from Arabidopsis thaliana (Mouse-ear cress), Schizosaccharomyces pombe (Fission yeast), Drosophila melanogaster and others.

    Proteins where this domain is known:
    PF10_0330   


    PTHR11621:SF25 - PTHR11621:SF25 (Panther link)

    Proteins where this domain is known:
    PFF0305c   


    PTHR11621:SF28 - PTHR11621:SF28 (Panther link)

    Proteins where this domain is known:
    PF13_0301   


    PTHR11621:SF29 - UBIQUITIN-CONJUGATING ENZYME E2 (Panther link)

    Proteins where this domain is known:
    PFL0190w   


    PTHR11621:SF30 - PTHR11621:SF30 (Panther link)

    Proteins where this domain is known:
    PFE1350c   


    PTHR11621:SF32 - PTHR11621:SF32 (Panther link)

    Proteins where this domain is known:
    PFI0740c   


    PTHR11621:SF33 - PTHR11621:SF33 (Panther link)

    Proteins where this domain is known:
    PF08_0085    PFC0855w   


    PTHR11621:SF36 - PTHR11621:SF36 (Panther link)

    Proteins where this domain is known:
    PFI1030c   


    PTHR11621:SF7 - PTHR11621:SF7 (Panther link)

    Proteins where this domain is known:
    PFL2100w   


    PTHR11621:SF8 - PTHR11621:SF8 (Panther link)

    Proteins where this domain is known:
    PFC0255c   


    PTHR11624 - PTHR11624 (Panther link)

    Proteins where this domain is known:
    MAL13P1.186    PF14_0441    PFE0225w    PFF0530w   


    PTHR11624:SF1 - BacTransketolase (Panther link)

    Interpro entry IPR005478 : Bacterial transketolase (Interpro link)

    Interpro description:

    Transketolase (TK) catalyzes the reversible transfer of a two-carbon ketol unit from xylulose 5-phosphate to an aldose acceptor, such as ribose 5-phosphate, to form sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate. This enzyme, together with transaldolase, provides a link between the glycolytic and pentose-phosphate pathways. TK requires thiamine pyrophosphate as a cofactor.

    This group includes two proteins from the yeast Saccharomyces cerevisiae (Baker's yeast) but excludes dihydroxyacetone synthases (formaldehyde transketolases) from various yeasts and the even more distant mammalian transketolases. Among the family of thiamine diphosphate-dependent enzymes that includes transketolases, dihydroxyacetone synthases, pyruvate dehydrogenase E1-beta subunits, and deoxyxylulose-5-phosphate synthases, mammalian and bacterial transketolases seem not to be orthologous.

    Proteins where this domain is known:
    PFF0530w   


    PTHR11624:SF11 - PTHR11624:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0441   


    PTHR11624:SF20 - 1-DEOXYXYLULOSE-5-PHOSPHATE SYNTHASE (Panther link)

    Proteins where this domain is known:
    MAL13P1.186   


    PTHR11624:SF21 - PTHR11624:SF21 (Panther link)

    Proteins where this domain is known:
    PFE0225w   


    PTHR11627 - Aldolase_I (Panther link)

    Interpro entry IPR000741 : Fructose-bisphosphate aldolase, class-I (Interpro link)

    Interpro description:

    Fructose-bisphosphate aldolase is a glycolytic enzyme that catalyses the reversible aldol cleavage or condensation of fructose-1,6-bisphosphate into dihydroxyacetone-phosphate and glyceraldehyde 3-phosphate. There are two classes of fructose-bisphosphate aldolases with different catalytic mechanisms: class I enzymes are found in animals, do not require a metal ion, and are characterised by the formation of a Schiff base intermediate between a highly conserved active site lysine and a substrate carbonyl group, while the class II enzymes are produced in bacteria and fungi, and require an active-site divalent metal ion. This entry represents the class I enzymes.

    In vertebrates, three forms of this enzyme are found: aldolase A is expressed in muscle, aldolase B in liver, kidney, stomach and intestine, and aldolase C in brain, heart and ovary. The different isozymes have different catalytic functions: aldolases A and C are mainly involved in glycolysis, while aldolase B is involved in both glycolysis and gluconeogenesis. Defects in aldolase B result in hereditary fructose intolerance.

    Proteins where this domain is known:
    PF14_0425   


    PTHR11629 - ATPase_V0/A0_116 (Panther link)

    Interpro entry IPR002490 : ATPase, V0/A0 complex, 116-kDa subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    This entry represents the 116-kDa subunit (or subunit a) and subunit I found in the V0 or A0 complex of V- or A-ATPases, respectively. The 116-kDa subunit is a transmembrane glycoprotein required for the assembly and proton transport activity of the ATPase complex. Several isoforms of the 116-kDa subunit exist, providing a potential role in the differential targeting and regulation of the V-ATPase for specific organelles.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF08_0113   


    PTHR11629:SF9 - PTHR11629:SF9 (Panther link)

    Proteins where this domain is known:
    PF08_0113   


    PTHR11630 - MCM (Panther link)

    Proteins where this domain is known:
    PF07_0023    PF13_0095    PF13_0291    PF14_0177    PFD0790c    PFE1345c    PFL0560c    PFL0580w   


    PTHR11630:SF26 - MCM_7 (Panther link)

    Interpro entry IPR008050 : MCM protein 7 (Interpro link)

    Interpro description:

    The MCM2-7 complex consists of six closely related proteins that are highly conserved throughout the eukaryotic kingdom. During late mitosis and G1, replication origins are 'licensed' for replication by loading the minichromosome maintenance (MCM) 2-7 proteins to form the pre-replicative complex, which is essential for initiating and elongating replication forks during S phase.

    The components of the MCM2-7 complex in Homo sapiens (Human) are MCM2, MCM3, MCM4, MCM5, MCM6 and MCM7.

    Studies in Xenopus eggs have shown the 6 MCM proteins to form hexamers, where each class is present in equal stoichiometry. The initiation of DNA synthesis in eukaryotes requires the binding of origin recognition complex (ORC) - a complex of six subunits - to the autonomously replicating sequences (ARS) of replication origins, the recruitment of CDC6 and binding of the MCM protein complex to the ARS to form the prereplicative complex (pre-RC). DNA synthesis is subsequently initiated by the activation of pre-RC by CDC7 and CDC28 protein kinases.

    MCM proteins associate with chromatin during G1 phase and dissociate again during S phase, remaining unbound until the end of mitosis. Periodic chromatin association of the MCM complex ensures that DNA synthesis from replication origins is initiated only once during the cell cycle, avoiding over-replication of parts of the genome. Elongation of replication forks away from individual replication origins results in displacement of the MCM-containing complex from chromatin. Budding yeast MCM proteins are translocated in and out of the nucleus during each cell cycle. However, fission yeast MCMs, like those in metazoans, are constitutively nuclear.

    The six classes of MCM protein together share a conserved 200 amino acid residue domain, while sequences within the same class show more extensive similarity outside this region. The conserved central domain is similar to the A motif of the Walker-type NTP-binding domain; it also shares similarity with ATPase domains of prokaryotic NtrC-related transcription regulators. The ATP-binding motif is thought to mediate ATP-dependent opening of double-stranded DNA at replication origins. In addition to the central region, MCM2, 4, 6 and 7 contain a zinc-finger-type motif thought to have a role in mediating protein-protein interactions. Moreover, a conserved alpha-helical structure in the C-terminal region has been noted; this comprises a conserved heptad repeat and a putative four-helix bundle. Most of the MCM proteins contain acidic regions, or alternately repeated clusters of acidic and basic residues.

    In addition to its role as a replication factor, the MCM7 protein has DNA helicase activity when complexed as a hexamer (containing two molecules each of MCM4, MCM6 and MCM7), suggesting that this complex is involved in the initiation of DNA replication as a DNA-unwinding enzyme. The human MCM7 gene has been localised to chromosome 7q21.3-q22.1. Increased expression of MCM7 RNA and protein in MYCN-amplified neuroblastoma tumours and cell lines has been reported. Furthermore, the MCM7 protein has been shown to form complexes with the retinoblastoma protein. These findings suggest MCM7-directed DNA replication contributes to neoplastic transformation.

    Proteins where this domain is known:
    PF07_0023   


    PTHR11630:SF42 - DNA REPLICATION LICENSING FACTOR MCM5 (Panther link)

    Proteins where this domain is known:
    PFL0580w   


    PTHR11630:SF43 - PTHR11630:SF43 (Panther link)

    Proteins where this domain is known:
    PF13_0291   


    PTHR11630:SF44 - PTHR11630:SF44 (Panther link)

    Proteins where this domain is known:
    PF14_0177   


    PTHR11630:SF45 - PTHR11630:SF45 (Panther link)

    Proteins where this domain is known:
    PF13_0095   


    PTHR11630:SF46 - PTHR11630:SF46 (Panther link)

    Proteins where this domain is known:
    PFE1345c   


    PTHR11630:SF47 - PTHR11630:SF47 (Panther link)

    Proteins where this domain is known:
    PFL0560c   


    PTHR11630:SF48 - PTHR11630:SF48 (Panther link)

    Proteins where this domain is known:
    PFD0790c   


    PTHR11632 - PTHR11632 (Panther link)

    Proteins where this domain is known:
    PF10_0334   


    PTHR11632:SF5 - PTHR11632:SF5 (Panther link)

    Proteins where this domain is known:
    PF10_0334   


    PTHR11635 - CAMP-DEPENDENT PROTEIN KINASE REGULATORY CHAIN (Panther link)

    Proteins where this domain is known:
    PF14_0173    PFL1110c   


    PTHR11635:SF18 - CAMP-DEPENDENT PROTEIN KINASE TYPE II-ALPHA REGULATORY SUBUNIT (Panther link)

    Proteins where this domain is known:
    PF14_0173    PFL1110c   


    PTHR11638 - PTHR11638 (Panther link)

    Proteins where this domain is known:
    PF08_0063    PF11_0175    PF14_0063   


    PTHR11638:SF10 - PTHR11638:SF10 (Panther link)

    Proteins where this domain is known:
    PF14_0063   


    PTHR11638:SF13 - PTHR11638:SF13 (Panther link)

    Proteins where this domain is known:
    PF11_0175   


    PTHR11638:SF18 - PTHR11638:SF18 (Panther link)

    Proteins where this domain is known:
    PF08_0063   


    PTHR11645 - P5CR (Panther link)

    Interpro entry IPR000304 : Delta 1-pyrroline-5-carboxylate reductase (Interpro link)

    Interpro description:

    Delta 1-pyrroline-5-carboxylate reductase (P5CR) is the enzyme that catalyzes the terminal step in the biosynthesis of proline from glutamate, the NAD(P)H-dependent reduction of 1-pyrroline-5-carboxylate to proline.

    Proteins where this domain is known:
    MAL13P1.284   


    PTHR11649 - MSS1/TRME-RELATED GTP-BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    MAL8P1.75a    MAL8P1.99    PF14_0339    PF14_0400    PFC0565w    PFE0665c    PFL0835w   


    PTHR11649:SF10 - gb def: Probable tRNA modification GTPase trme (Panther link)

    Proteins where this domain is known:
    MAL8P1.75a   


    PTHR11649:SF13 - GTP-BINDING PROTEIN ENGB-RELATED (Panther link)

    Proteins where this domain is known:
    MAL8P1.99    PF14_0400    PFE0665c   


    PTHR11649:SF3 - PTHR11649:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0339   


    PTHR11649:SF5 - PTHR11649:SF5 (Panther link)

    Proteins where this domain is known:
    PFC0565w    PFL0835w   


    PTHR11652 - Ribosomal_S12_23 (Panther link)

    Interpro entry IPR006032 : Ribosomal protein S12/S23 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This protein is known typically as S12 in bacteria, S23 in eukaryotes and as either S12 or S23 in the Archaea.

    Bacterial S12 molecules contain a conserved aspartic acid residue which undergoes a novel post-translational modification, beta-methylthiolation, to form the corresponding 3-methylthioaspartic acid.

    Proteins where this domain is known:
    PFC0290w    PFD0600c   


    PTHR11652:SF1 - Ribosom_S12_bac (Panther link)

    Interpro entry IPR005679 : Ribosomal protein S12, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S12 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S12 is known to be involved in the translation initiation step. It is a very basic protein of 120 to 150 amino-acid residues. S12 belongs to a family of ribosomal proteins which are grouped on the basis of sequence similarities. This family consists of ribosomal protein S12 from bacteria, mitochondria, and chloroplasts.

    Proteins where this domain is known:
    PFD0600c   


    PTHR11652:SF2 - PTHR11652:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0290w   


    PTHR11655 - Ribosomal_L6 (Panther link)

    Interpro entry IPR000702 : Ribosomal protein L6 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.

    Proteins where this domain is known:
    PF13_0129   


    PTHR11655:SF3 - PTHR11655:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0129   


    PTHR11659 - GatB (Panther link)

    Interpro entry IPR017959 : Aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase, subunit B/E (Interpro link)

    Interpro description:

    Glutamyl-tRNA(Gln) amidotransferase (Gat) provides a means of producing correctly charged Gln-tRNA(Gln) through the transamidation of mis-acylated Glu-tRNA(Gln) in organisms which lack glutaminyl-tRNA synthetase. The reaction takes place in the presence of glutamine and ATP through an activated gamma-phospho-Glu-tRNA(Gln). The enzyme is composed of three subunits: A (an amidase), B and C. It also exists in eukaryotes as a protein targeted to the mitochondria.

    The heterotrimer GatABC is involved in converting Glu to Gln and/or Asp to Asn, when the amino acid is attached to the appropriate tRNA. In Lactobacillus, GatABC is responsible for producing tRNA(Gln). In Archaea, GatABC is responsible for producing tRNA(Asn), while GatDE is responsible for producing tRNA(Gln). In lineages that include Thermus, Chlamydia, or Acidithiobacillus, the GatABC complex catalyses both tRNA(Gln) and tRNA(Asn).

    This entry represents aspartyl/glutamyl-tRNA(Asn/Gln) amidotransferase subunit B and glutamyl-tRNA(Gln) amidotransferase subunit E.

    Proteins where this domain is known:
    PFF1395c   


    PTHR11661 - Ribosomal_L11 (Panther link)

    Interpro entry IPR000911 : Ribosomal protein L11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

    Proteins where this domain is known:
    PF11_0113    PFE0850c   


    PTHR11661:SF1 - Ribosom_L11_bac (Panther link)

    Interpro entry IPR006519 : Ribosomal protein L11, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, chloroplast, cyanelle, and most mitochondrial forms of ribosomal protein L11. L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure. This entry represents the bacterial, chloroplast and mitochondrial forms.

    Proteins where this domain is known:
    PF11_0113   


    PTHR11661:SF2 - 60S RIBOSOMAL PROTEIN L12 (Panther link)

    Proteins where this domain is known:
    PFE0850c   


    PTHR11668 - T_phtase_apaH (Panther link)

    Interpro entry IPR006186 : Serine/threonine-specific protein phosphatase and bis(5-nucleosyl)-tetraphosphatase (Interpro link)

    Interpro description:

    Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of protein phosphatase (PP) have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.

    Protein phosphatase-1 (PP1) and protein phosphatase-2A (PP2A) have a broad specificity and there are two closely related isoforms of each, alpha and beta. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 kDa regulatory subunit and a third variable subunit. Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin, is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. Other serine/threonine-specific protein phosphatases that have been characterised include mammalian phosphatase-X (PP-X) and Drosophila phosphatase-V (PP-V), which are closely related to but distinct from PP2A; yeast phosphatase PPH3, which is similar to PP2A, but with different enzymatic properties; and Drosophila phosphatase-Y (PP-Y) and yeast phosphatases Z1 and Z2, which are closely related to but distinct from PP1.

    Proteins where this domain is known:
    MAL13P1.274    PF08_0129    PF10_0177    PF14_0142    PF14_0224    PF14_0630    PFC0595c    PFI1245c    PFI1360c   


    PTHR11668:SF12 - PTHR11668:SF12 (Panther link)

    Proteins where this domain is known:
    MAL13P1.274   


    PTHR11668:SF13 - PTHR11668:SF13 (Panther link)

    Proteins where this domain is known:
    PF14_0224   


    PTHR11668:SF15 - PTHR11668:SF15 (Panther link)

    Proteins where this domain is known:
    PF08_0129   


    PTHR11668:SF17 - PTHR11668:SF17 (Panther link)

    Proteins where this domain is known:
    PF14_0630   


    PTHR11668:SF19 - PROTEIN PHOSPHATASE-1 (Panther link)

    Proteins where this domain is known:
    PF14_0142   


    PTHR11668:SF23 - PTHR11668:SF23 (Panther link)

    Proteins where this domain is known:
    PFI1360c   


    PTHR11668:SF24 - PTHR11668:SF24 (Panther link)

    Proteins where this domain is known:
    PFC0595c   


    PTHR11668:SF28 - PTHR11668:SF28 (Panther link)

    Proteins where this domain is known:
    PFI1245c   


    PTHR11668:SF8 - PTHR11668:SF8 (Panther link)

    Proteins where this domain is known:
    PF10_0177   
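
    The Panther identifiers in this listing follow a family:subfamily pattern (for example PTHR11668 and PTHR11668:SF12), and each subfamily lists a subset of the proteins given for its family. The sketch below is one possible way to split such identifiers and to check that the PTHR11668 subfamily lists above add up to the family list. The function names and the regular expression are illustrative assumptions, not part of any Panther or Interpro tool; the protein identifiers are copied from the entries above.

```python
import re
from collections import defaultdict

# Assumed pattern for identifiers as they appear in this listing:
# "PTHR" + digits, optionally followed by ":SF" + digits.
PANTHER_ID = re.compile(r"^(PTHR\d+)(?::(SF\d+))?$")


def split_panther_id(accession):
    """Return (family, subfamily); subfamily is None for a family-level ID."""
    match = PANTHER_ID.match(accession)
    if match is None:
        raise ValueError(f"not a Panther-style identifier: {accession!r}")
    return match.group(1), match.group(2)


# Proteins listed under the family entry PTHR11668 in this document.
FAMILY_PROTEINS = {
    "MAL13P1.274", "PF08_0129", "PF10_0177", "PF14_0142", "PF14_0224",
    "PF14_0630", "PFC0595c", "PFI1245c", "PFI1360c",
}

# Proteins listed under each PTHR11668 subfamily entry in this document.
SUBFAMILY_PROTEINS = {
    "PTHR11668:SF12": ["MAL13P1.274"],
    "PTHR11668:SF13": ["PF14_0224"],
    "PTHR11668:SF15": ["PF08_0129"],
    "PTHR11668:SF17": ["PF14_0630"],
    "PTHR11668:SF19": ["PF14_0142"],
    "PTHR11668:SF23": ["PFI1360c"],
    "PTHR11668:SF24": ["PFC0595c"],
    "PTHR11668:SF28": ["PFI1245c"],
    "PTHR11668:SF8": ["PF10_0177"],
}


def proteins_by_family(subfamily_proteins):
    """Group subfamily protein lists under their parent family identifier."""
    grouped = defaultdict(set)
    for accession, proteins in subfamily_proteins.items():
        family, _subfamily = split_panther_id(accession)
        grouped[family].update(proteins)
    return dict(grouped)


if __name__ == "__main__":
    grouped = proteins_by_family(SUBFAMILY_PROTEINS)
    # True when the subfamily lists together cover exactly the family list.
    print(grouped["PTHR11668"] == FAMILY_PROTEINS)
```

    The same grouping could be applied to the other families in the listing, although not every family entry here is accompanied by subfamily entries.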


    PTHR11669 - REPLICATION FACTOR C / DNA POLYMERASE III GAMMA-TAU SUBUNIT (Panther link)

    Proteins where this domain is known:
    PF11_0117    PF14_0601    PFB0840w    PFL2005w   


    PTHR11670 - Aconitase-like_core (Panther link)

    Interpro entry IPR015937 : Aconitase-like core (Interpro link)

    Interpro description:

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.

    This entry represents the core four domains that make up aconitase, as well as the structurally similar core domains of homoaconitase, 3-isopropylmalate dehydratase small and large subunits, 2-methylisocitrate dehydratase (AcnD), and iron regulatory protein 2 (IRP2).

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    PTHR11670:SF1 - Aconitase/Fe_reg_prot_2/AcnD (Panther link)

    Interpro entry IPR015934 : Aconitase/Iron regulatory protein 2/2-methylisocitrate dehydratase (Interpro link)

    Interpro description:

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.

    This entry represents several aconitase proteins, including bacterial aconitase A (AcnA), eukaryotic cytosolic aconitase (cAcn) and a few mitochondrial aconitases (mAcn) (but not the majority of mAcn enzymes). In addition, this entry represents the related proteins: iron-regulatory protein 2 (IRP2) and Fe/S-dependent 2-methylisocitrate dehydratase (AcnD).

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    PTHR11671 - ATPase_V1/A1_D (Panther link)

    Interpro entry IPR002699 : ATPase, V1/A1 complex, subunit D (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    This entry represents the D subunit found in V1 and A1 complexes of V- and A-ATPases, respectively. Subunit D appears to be located in the central stalk, whereas subunits E and G form part of the peripheral stalk connecting V1 and V0. This subunit is the most likely homologue to the gamma subunit of the F1 complex in F-ATPases, which undergoes rotation during ATP hydrolysis and serves an essential function in rotary catalysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0227   


    PTHR11673 - EIF5A_hypusine (Panther link)

    Interpro entry IPR001884 : Eukaryotic initiation factor 5A hypusine (eIF-5A) (Interpro link)

    Interpro description:

    Translation initiation factor 5A (IF-5A) is reported to be involved in the first step of peptide bond formation in translation, to be involved in cell-cycle regulation and to be a cofactor for the Rev and Rex transactivator proteins of human immunodeficiency virus-1 and T-cell leukaemia virus I, respectively. IF-5A contains an unusual amino acid, hypusine (N-epsilon-(4-aminobutyl-2-hydroxy)lysine), that is required for its function. The first step in the post-translational modification of lysine to hypusine is catalyzed by the enzyme deoxyhypusine synthase, the structure of which has been reported.

    The crystal structure of IF-5A from the archaeon Pyrobaculum aerophilum has been determined at 1.75 Å resolution. Unmodified P. aerophilum IF-5A is found to be a beta structure with two domains and three separate hydrophobic cores. The lysine (Lys42) that is post-translationally modified by deoxyhypusine synthase is found at one end of the IF-5A molecule in a turn between beta strands beta4 and beta5; this lysine residue is freely solvent accessible. The C-terminal domain is found to be homologous to the cold-shock protein CspA of E. coli, which has a well characterised RNA-binding fold, suggesting that IF-5A is involved in RNA binding.

    Proteins where this domain is known:
    PFL0210c   


    PTHR11673:SF2 - INITIATION FACTOR 5A (Panther link)

    Proteins where this domain is known:
    PFL0210c   


    PTHR11679 - Sec1-like (Panther link)

    Interpro entry IPR001619 : Sec1-like protein (Interpro link)

    Interpro description:

    Sec1-like molecules have been implicated in a variety of eukaryotic vesicle transport processes including neurotransmitter release by exocytosis. They regulate vesicle transport by binding to a t-SNARE from the syntaxin family. This process is thought to prevent SNARE complex formation, a protein complex required for membrane fusion. Whereas Sec1 molecules are essential for neurotransmitter release and other secretory events, their interaction with syntaxin molecules seems to represent a negative regulatory step in secretion.

    Proteins where this domain is known:
    PF10_0331    PFB0750w    PFF0665c    PFI1700c   


    PTHR11679:SF1 - PTHR11679:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1700c   


    PTHR11679:SF2 - PTHR11679:SF2 (Panther link)

    Proteins where this domain is known:
    PF10_0331   


    PTHR11679:SF3 - PTHR11679:SF3 (Panther link)

    Proteins where this domain is known:
    PFB0750w   


    PTHR11679:SF6 - PTHR11679:SF6 (Panther link)

    Proteins where this domain is known:
    PFF0665c   


    PTHR11680 - Gly_HO-Metrfase (Panther link)

    Interpro entry IPR001085 : Glycine hydroxymethyltransferase (Interpro link)

    Interpro description:
    Synonym(s): Serine hydroxymethyltransferase, Serine aldolase, Threonine aldolase

    Serine hydroxymethyltransferase (SHMT) is a pyridoxal phosphate (PLP) dependent enzyme and belongs to the aspartate aminotransferase superfamily (fold type I). The pyridoxal-P group is attached to a lysine residue around which the sequence is highly conserved in all forms of the enzyme. The enzyme carries out interconversion of serine and glycine using PLP as the cofactor. SHMT catalyses the transfer of a hydroxymethyl group from N5,N10-methylene tetrahydrofolate to glycine, resulting in the formation of serine and tetrahydrofolate. Both eukaryotic and prokaryotic SHMT enzymes form tight obligate homodimers and the mammalian enzyme forms a homotetramer. PLP dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalysed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV) and glycogen phosphorylase family (fold type V).

    In vertebrates, glycine hydroxymethyltransferase exists in a cytoplasmic and a mitochondrial form whereas only one form is found in prokaryotes.

    Proteins where this domain is known:
    PF14_0534    PFL1720w   


    PTHR11685 - PTHR11685 (Panther link)

    Proteins where this domain is known:
    PFC0175w   


    PTHR11685:SF11 - PTHR11685:SF11 (Panther link)

    Proteins where this domain is known:
    PFC0175w   


    PTHR11693 - ATPase_F1_gamma (Panther link)

    Interpro entry IPR000131 : ATPase, F1 complex, gamma subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0061   


    PTHR11693:SF4 - PTHR11693:SF4 (Panther link)

    Proteins where this domain is known:
    PF13_0061   


    PTHR11700 - Ribosomal_S10 (Panther link)

    Interpro entry IPR001848 : Ribosomal protein S10 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacterial S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.

    Proteins where this domain is known:
    PF10_0038    PF14_0581   


    PTHR11700:SF2 - Ribos_S10_bac (Panther link)

    Interpro entry IPR005731 : Ribosomal protein S10, bacterial (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Included in the family are one member each from Saccharomyces cerevisiae (Baker's yeast) and Schizosaccharomyces pombe (Fission yeast). These proteins lack an N-terminal mitochondrial transit peptide but contain additional sequence C-terminal to the ribosomal S10 protein region.

    Proteins where this domain is known:
    PF14_0581   


    PTHR11700:SF3 - 40S RIBOSOMAL PROTEIN S20 (Panther link)

    Proteins where this domain is known:
    PF10_0038   


    PTHR11702 - PTHR11702 (Panther link)

    Proteins where this domain is known:
    MAL13P1.294    MAL8P1.33    PF14_0114    PFD0710w    PFE1215c    PFF0385c    PFF0625w   


    PTHR11702:SF3 - GTP_bd_Obg/CgtA (Panther link)

    Interpro entry IPR014100 : (Interpro link)

    Interpro description:

    This entry describes a universal, mostly one-gene-per-genome GTP-binding protein that associates with ribosomal subunits and appears to play a role in ribosomal RNA maturation. Mutations in this gene are pleiotropic, but it appears that effects on cellular functions such as chromosome partition may be secondary to the effect on ribosome structure.

    Proteins where this domain is known:
    MAL8P1.33    PF14_0114   


    PTHR11702:SF4 - PTHR11702:SF4 (Panther link)

    Proteins where this domain is known:
    PFD0710w    PFF0625w   


    PTHR11702:SF6 - PTHR11702:SF6 (Panther link)

    Proteins where this domain is known:
    MAL13P1.294   


    PTHR11702:SF8 - PTHR11702:SF8 (Panther link)

    Proteins where this domain is known:
    PFE1215c   


    PTHR11703 - Deoxyhypus_synth (Panther link)

    Interpro entry IPR002773 : Deoxyhypusine synthase (Interpro link)

    Interpro description:
    Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid, hypusine [N epsilon-(4-amino-2-hydroxybutyl)lysine]. The first step in the post-translational formation of hypusine is catalysed by the enzyme deoxyhypusine synthase (DS). The enzyme catalyses the following reaction:
     Spermidine + [eIF-5A]-lysine = 1,3-diaminopropane + [eIF-5A]-deoxyhypusine 
    The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation. The structure is known for this enzyme in complex with its NAD+ cofactor.

    Proteins where this domain is known:
    PF14_0125   


    PTHR11706 - Nramp (Panther link)

    Interpro entry IPR001046 : Natural resistance-associated macrophage protein (Interpro link)

    Interpro description:

    The natural resistance-associated macrophage protein (NRAMP) family consists of Nramp1, Nramp2, and yeast proteins Smf1 and Smf2. The NRAMP family is a novel family of functionally related proteins defined by a conserved hydrophobic core of ten transmembrane domains. Nramp1 is an integral membrane protein expressed exclusively in cells of the immune system and is recruited to the membrane of a phagosome upon phagocytosis. Nramp2 is a multiple divalent cation transporter for Fe2+, Mn2+ and Zn2+ amongst others. It is expressed at high levels in the intestine, and is the major transferrin-independent iron uptake system in mammals. The yeast proteins Smf1 and Smf2 may also transport divalent cations.

    The natural resistance of mice to infection with intracellular parasites is controlled by the Bcg locus, which modulates the cytostatic/cytocidal activity of phagocytes. Nramp1, the gene responsible, is expressed exclusively in macrophages and polymorphonuclear leukocytes, and encodes a polypeptide (natural resistance-associated macrophage protein) with features typical of integral membrane proteins. Other transporter proteins from a variety of sources also belong to this family.

    Proteins where this domain is known:
    PFE1185w   


    PTHR11706:SF6 - PTHR11706:SF6 (Panther link)

    Proteins where this domain is known:
    PFE1185w   


    PTHR11708 - RAS-RELATED GTPASE (Panther link)

    Proteins where this domain is known:
    MAL13P1.205    MAL13P1.241    MAL13P1.51    PF08_0110    PF11_0183    PF11_0461    PF13_0119    PFA0335w    PFB0500c    PFE0625w    PFE0690c    PFF0810c    PFI0155c    PFL1500w   


    PTHR11708:SF184 - gb def: Rab5b protein-related (Panther link)

    Proteins where this domain is known:
    MAL13P1.51   


    PTHR11708:SF213 - Rab11 (Panther link)

    Interpro entry IPR015595 : (Interpro link)

    Interpro description:

    Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.

    Rab11 is a ubiquitously expressed Rab protein that is involved in the endosomal recycling pathway in mammalian cells and has been shown to co-localise with Sec15. It also co-localises with the transferrin receptor on pericentriolar recycling endosomes (REs) and is involved in recycling of transferrin to the plasma membrane. Rab11 has also been implicated in apical recycling and transcytosis in Madin-Darby canine kidney cells and trans-Golgi network to plasma membrane trafficking via REs in baby hamster kidney cells.

    Proteins where this domain is known:
    MAL13P1.205    PF13_0119   


    PTHR11708:SF227 - RAS-RELATED PROTEIN RAB-1 (Panther link)

    Proteins where this domain is known:
    PFE0625w    PFE0690c   


    PTHR11708:SF245 - Rab18 (Panther link)

    Interpro entry IPR015598 : (Interpro link)

    Interpro description:

    Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.

    In higher plants, Rab18 is a drought-responsive gene involved in proline biosynthesis.

    Proteins where this domain is known:
    PF08_0110   


    PTHR11708:SF246 - PTHR11708:SF246 (Panther link)

    Proteins where this domain is known:
    PFL1500w   


    PTHR11708:SF254 - Rab5_like (Panther link)

    Interpro entry IPR015599 : (Interpro link)

    Interpro description:

    Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.

    Rab5 is a regulatory GTPase that is associated with the sorting endosome and participates in endosomal membrane fusion reactions. Recent experiments have provided insights into Rab5 function by demonstrating direct links between Rab5-interacting proteins and components of the membrane fusion apparatus. In addition, a realisation that Rab5 has additional functions in endosome biogenesis is emerging.

    Proteins where this domain is known:
    PFA0335w    PFB0500c   


    PTHR11708:SF256 - Rab6_like (Panther link)

    Interpro entry IPR015600 : (Interpro link)

    Interpro description:

    Rab-like GTPases are key regulators of most if not all vesicular trafficking events between the various subcellular compartments within the eukaryotic cell. Rab-related proteins have been implicated in regulating the formation of vesicles at the donor membrane, as well as the movement, tethering and docking of vesicles, and their fusion with the acceptor membrane. The regulatory capacity of Rab-like proteins is dependent on their ability to cycle between GTP-bound active and GDP-bound inactive states. Activation of a Rab is coupled to its association with intracellular membranes, allowing it to recruit downstream effector proteins to the cytoplasmic surface of a subcellular compartment.

    Using antibodies to study in vivo trafficking, Rab6 was found to be in its GTP-bound conformation on the Golgi apparatus and transport intermediates, and the geometry of transport intermediates was modulated by Rab6 activity. Recent work showed that dynactin binds to Rab6 and is recruited to Golgi membranes in a Rab6-dependent manner. Other Golgi Rabs do not bind to dynactin and are unable to support its recruitment to membranes. Rab6 therefore functions as a specificity or tethering factor controlling the recruitment of dynactin to membranes.

    Proteins where this domain is known:
    PF11_0461   


    PTHR11708:SF260 - PTHR11708:SF260 (Panther link)

    Proteins where this domain is known:
    PFI0155c   


    PTHR11708:SF269 - GTP-BINDING NUCLEAR PROTEIN RAN (Panther link)

    Proteins where this domain is known:
    PF11_0183   


    PTHR11708:SF276 - PTHR11708:SF276 (Panther link)

    Proteins where this domain is known:
    PFF0810c   


    PTHR11710 - Ribosomal_S19E (Panther link)

    Interpro entry IPR001266 : Ribosomal protein S19e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family includes a number of eukaryotic and archaebacterial ribosomal proteins; mammalian S19, Drosophila S19, Ascaris lumbricoides S19g (ALEP-1) and S19s, yeast YS16 (RP55A and RP55B), Aspergillus S16 and Haloarcula marismortui HS12.

    Proteins where this domain is known:
    PFD1055w   


    PTHR11711 - ARF/SAR (Panther link)

    Interpro entry IPR006689 : ARF/SAR superfamily (Interpro link)

    Interpro description:

    The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.

    The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.

    Proteins where this domain is known:
    MAL13P1.297    PF10_0203    PF10_0337    PF13_0090    PF14_0399    PF14_0485    PFD0810w    PFI1005w    PFI1180w   


    PTHR11711:SF12 - SAR1_GTP_bd (Panther link)

    Interpro entry IPR006687 : GTP-binding protein SAR1 (Interpro link)

    Interpro description:

    The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.

    The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.

    The SAR1 protein, first identified in budding yeast, is a 21 kDa GTP-binding protein involved in vesicular transport between the endoplasmic reticulum and the Golgi. It takes part in the formation of secretory vesicles by binding to an ER type II membrane protein, Sec12p. It is evolutionarily conserved and seems to be present in all eukaryotes.

    SAR1 is generally included in the RAS 'superfamily' of small GTP-binding proteins, but it is only slightly related to other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at the C terminus and is therefore not subject to prenylation. SAR1 is slightly related to ARFs.

    Proteins where this domain is known:
    PFD0810w   


    PTHR11711:SF26 - PTHR11711:SF26 (Panther link)

    Proteins where this domain is known:
    PF14_0399   


    PTHR11711:SF30 - PTHR11711:SF30 (Panther link)

    Proteins where this domain is known:
    MAL13P1.297    PF10_0203    PF10_0337   


    PTHR11711:SF5 - PTHR11711:SF5 (Panther link)

    Proteins where this domain is known:
    PFI1180w   


    PTHR11711:SF9 - PTHR11711:SF9 (Panther link)

    Proteins where this domain is known:
    PF13_0090   


    PTHR11712 - Ketoacyl_synth (Panther link)

    Interpro entry IPR000794 : Beta-ketoacyl synthase (Interpro link)

    Interpro description:

    Beta-ketoacyl-ACP synthase (KAS) is the enzyme that catalyses the condensation of malonyl-ACP with the growing fatty acid chain. It is found as a component of a number of enzymatic systems, including fatty acid synthetase (FAS), which catalyses the formation of long-chain fatty acids from acetyl-CoA, malonyl-CoA and NADPH; the multi-functional 6-methylsalicylic acid synthase (MSAS) from Penicillium patulum, which is involved in the biosynthesis of a polyketide antibiotic; polyketide antibiotic synthase enzyme systems; Emericella nidulans multifunctional protein Wa, which is involved in the biosynthesis of conidial green pigment; Rhizobium nodulation protein nodE, which probably acts as a beta-ketoacyl synthase in the synthesis of the nodulation Nod factor fatty acyl chain; and yeast mitochondrial protein CEM1. The condensation reaction is a two-step process: first the acyl component of an activated acyl primer is transferred to a cysteine residue of the enzyme, and it is then condensed with an activated malonyl donor with the concomitant release of carbon dioxide.

    Proteins where this domain is known:
    PFF1275c   


    PTHR11712:SF23 - PTHR11712:SF23 (Panther link)

    Proteins where this domain is known:
    PFF1275c   


    PTHR11715 - GCV_H (Panther link)

    Interpro entry IPR002930 : Glycine cleavage H-protein (Interpro link)

    Interpro description:

    This is a family of glycine cleavage H-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyses the catabolism of glycine in eukaryotes. A lipoyl group is attached to a completely conserved lysine residue. The H protein shuttles the methylamine group of glycine from the P protein to the T protein.

    Proteins where this domain is known:
    PF11_0339   


    PTHR11717 - Low_mwt_PTPase (Panther link)

    Interpro entry IPR017867 : Protein-tyrosine phosphatase, low molecular weight (Interpro link)

    Interpro description:

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:

    Based on their cellular localisation, PTPases are also classified as:

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.
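    As a rough illustration of the C(X)5R signature described above, the following Python sketch scans a protein sequence for a cysteine, any five residues, then an arginine. The function name and the toy sequence are hypothetical and are not taken from this listing or from any P. falciparum protein; a real domain assignment relies on profile-based methods rather than a bare pattern match.

```python
import re

# C(X)5R: cysteine, any five residues, arginine.
# The lookahead lets overlapping occurrences be reported as well.
PTP_SIGNATURE = re.compile(r"(?=(C.{5}R))")


def find_ptp_signature(sequence):
    """Return (1-based start, matched text) for each C(X)5R occurrence."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in PTP_SIGNATURE.finditer(seq)]


# Hypothetical toy sequence used only to exercise the pattern.
toy_sequence = "MKVLFVCLGNICRSPMAE"
print(find_ptp_signature(toy_sequence))
```

    A hit from such a scan only flags a candidate site; the motif is short enough to occur by chance in unrelated proteins.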

    This entry represents the low molecular weight (LMW) protein-tyrosine phosphatases (or acid phosphatase), which act on tyrosine phosphorylated proteins, low-MW aryl phosphates and natural and synthetic acyl phosphates. The structure of a LMW PTPase has been solved by X-ray crystallography and is found to form a single structural domain. It belongs to the alpha/beta class, with 6 alpha-helices and 4 beta-strands forming a 3-layer alpha-beta-alpha sandwich architecture.

    Proteins where this domain is known:
    PFF0515c   


    PTHR11717:SF2 - PTHR11717:SF2 (Panther link)

    Proteins where this domain is known:
    PFF0515c   


    PTHR11721 - PTHR11721 (Panther link)

    Proteins where this domain is known:
    PFF0885w   


    PTHR11722 - Ribosomal_L13E (Panther link)

    Interpro entry IPR001380 : Ribosomal protein L13e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The ribosomal protein L13e is widely found in vertebrates, Drosophila melanogaster, plants, yeast and others.

    Proteins where this domain is known:
    PF08_0075   


    PTHR11726 - Ribosomal_L10E (Panther link)

    Interpro entry IPR001197 : Ribosomal protein L10e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A variety of eukaryotic and plant ribosomal L10e proteins can be grouped. This family consists of vertebrate L10 (QM), plant L10, Caenorhabditis elegans L10, yeast L10 (QSR1) and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0543.

    Proteins where this domain is known:
    PF14_0141   


    PTHR11726:SF1 - PTHR11726:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0141   


    PTHR11727 - RRNA_meth_trans (Panther link)

    Interpro entry IPR001737 : Ribosomal RNA adenine methylase transferase (Interpro link)

    Interpro description:

    This family of proteins includes rRNA adenine dimethylases (e.g. KsgA) and the erythromycin resistance methylases (Erm).

    The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism. In contrast, the yeast ortholog, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.

    The crystal structure of KsgA from Escherichia coli has been solved to a resolution of 2.1 Å. It bears a strong similarity to the crystal structure of ErmC' from Bacillus stearothermophilus and a lesser similarity to the yeast mitochondrial transcription factor, sc-mtTFB.

    The Erm family of RNA methyltransferases, which methylate a single adenosine base in 23S rRNA, confer resistance to the MLS-B group of antibiotics. Despite their sequence similarity, the two enzyme families have strikingly different levels of regulation that remain to be elucidated. Other orthologs of this family include the yeast and Homo sapiens (Human) mitochondrial transcription factors (MTF1 and h-mtTFB respectively), which are nuclear encoded. Human-mtTFB is able to stimulate transcription in vitro independently of its S-adenosylmethionine binding and rRNA methyltransferase activity.

    Proteins where this domain is known:
    PF14_0156    PFL2395c   


    PTHR11728 - NAD_Gly3P_DH (Panther link)

    Interpro entry IPR006168 : NAD-dependent glycerol-3-phosphate dehydrogenase (Interpro link)

    Interpro description:
    NAD-dependent glycerol-3-phosphate dehydrogenase (GPD) catalyzes the reversible reduction of dihydroxyacetone phosphate to glycerol-3-phosphate. It is a cytoplasmic protein, active as a homodimer, each monomer containing an N-terminal NAD binding site. In insects, it acts in conjunction with a mitochondrial alpha-glycerophosphate oxidase in the alpha-glycerophosphate cycle, which is essential for the production of energy used in insect flight.

    Proteins where this domain is known:
    PF11_0157    PFL0780w   


    PTHR11731 - PTHR11731 (Panther link)

    Proteins where this domain is known:
    PFC0950c   


    PTHR11731:SF7 - PTHR11731:SF7 (Panther link)

    Proteins where this domain is known:
    PFC0950c   


    PTHR11732 - Aldo/ket_red (Panther link)

    Interpro entry IPR001395 : Aldo/keto reductase (Interpro link)

    Interpro description:

    The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.

    Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.

    Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.

    Proteins where this domain is known:
    MAL13P1.324    PF14_0088   


    PTHR11732:SF15 - PTHR11732:SF15 (Panther link)

    Proteins where this domain is known:
    MAL13P1.324   


    PTHR11732:SF6 - PTHR11732:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0088   


    PTHR11735 - Pept_M22_Osialgl (Panther link)

    Interpro entry IPR009180 : Peptidase M22, O-sialoglycoprotein endopeptidase (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
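
    The stringent motif described above can be sketched as a regular expression. This is only an illustration: the residue classes chosen for 'b' (uncharged) and 'c' (hydrophobic), the function name find_motifs, and the toy sequence are assumptions made for this example and are not taken from MEROPS or InterPro.

```python
import re

# Assumed residue classes (illustrative choices, not an official definition).
UNCHARGED = "ACFGILMNQSTVWY"         # 'b': excludes D, E, K, R, H and proline
HYDROPHOBIC = "AFILMVW"              # 'c'
ANY_NON_PRO = "ACDEFGHIKLMNQRSTVWY"  # 'X' (and relaxed 'a'): anything but proline


def find_motifs(sequence, strict_a=True):
    """Return (1-based start, matched text) for each 'abXHEbbHbc' hit.

    strict_a=True requires 'a' to be Val or Thr ("most often" in the text);
    strict_a=False accepts any non-proline residue at that position.
    """
    a = "[VT]" if strict_a else f"[{ANY_NON_PRO}]"
    pattern = (
        f"(?=({a}[{UNCHARGED}][{ANY_NON_PRO}]HE"
        f"[{UNCHARGED}]{{2}}H[{UNCHARGED}][{HYDROPHOBIC}]))"
    )
    # The lookahead lets overlapping candidates be reported as well.
    return [(m.start() + 1, m.group(1)) for m in re.finditer(pattern, sequence.upper())]


if __name__ == "__main__":
    toy = "MKTVVAHELTHAVGG"  # hypothetical sequence, for illustration only
    print(find_motifs(toy))
```

    A bare HEXXH search (HE..H) would accept many more sequences; the extra positions narrow the hits to those resembling the metal-binding helix, at the cost of depending on how the residue classes are defined.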

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of metallopeptidases belongs to MEROPS peptidase family M22 (clan MK). The Pasteurella haemolytica secreted O-sialoglycoprotein endopeptidase Gcp (glycoprotease) cleaves only proteins that are heavily sialylated, in particular those with sialylated serine and threonine residues. It does not cleave unglycosylated proteins, desialylated glycoproteins or glycoproteins that are only N-glycosylated.

    In some organisms, the O-sialoglycoprotein endopeptidase domain is fused to the serine/threonine protein kinase domain STYKS.

    Proteins where this domain is known:
    PF10_0299    PFD0440w   


    PTHR11739 - Citrate_synth (Panther link)

    Interpro entry IPR002020 : Citrate synthase-like (Interpro link)

    Interpro description:

    Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).

    Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.

    This entry represents types I and II citrate synthase enzymes, as well as the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrially produced citrate.

    Proteins where this domain is known:
    PF10_0218    PFF0455w   


    PTHR11740 - CAS_kinase_II (Panther link)

    Interpro entry IPR000704 : Casein kinase II, regulatory subunit (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    Casein kinase, a ubiquitous, well-conserved protein kinase involved in cell metabolism and differentiation, is characterised by its preference for Ser or Thr in acidic stretches of amino acids. The enzyme is a tetramer of 2 alpha- and 2 beta-subunits. However, some species (e.g., mammals) possess 2 related forms of the alpha-subunit (alpha and alpha'), while others (e.g., fungi) possess 2 related beta-subunits (beta and beta'). The alpha-subunit is the catalytic unit and contains regions characteristic of serine/threonine protein kinases. The beta-subunit is believed to be regulatory, possessing an N-terminal auto-phosphorylation site, an internal acidic domain, and a potential metal-binding motif. The beta subunit is a highly conserved protein of about 25 kD that contains, in its central section, a cysteine-rich motif, CX(n)C, that could be involved in binding a metal such as zinc. The mammalian beta-subunit gene promoter shares common features with those of other mammalian protein kinases and is closely related to the promoter of the regulatory subunit of cAMP-dependent protein kinase.

    Proteins where this domain is known:
    PF11_0048    PF13_0232   


    PTHR11741 - Transl_elong_EFTs/EF1B (Panther link)

    Interpro entry IPR001816 : Translation elongation factor EFTs/EF1B (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    This entry represents EF-Ts (EF1B) proteins found primarily in bacteria, mitochondria and chloroplasts.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PF14_0406    PFC0225c   


    PTHR11746 - PTHR11746 (Panther link)

    Proteins where this domain is known:
    MAL13P1.69   


    PTHR11749 - Ribul_P_3_epim (Panther link)

    Interpro entry IPR000056 : Ribulose-phosphate 3-epimerase (Interpro link)

    Interpro description:
    Ribulose-phosphate 3-epimerase (also known as pentose-5-phosphate 3-epimerase or PPE) is the enzyme that converts D-ribulose 5-phosphate into D-xylulose 5-phosphate in Calvin's reductive pentose phosphate cycle. In Ralstonia eutropha (Alcaligenes eutrophus) two copies of the gene coding for PPE are known: one is chromosomally encoded, the other is on a plasmid. PPE has been found in a wide range of bacteria, archaebacteria, fungi and plants. All the proteins have from 209 to 241 amino acid residues. The enzyme has a TIM barrel structure.

    Proteins where this domain is known:
    PFL0960w   


    PTHR11752 - PTHR11752 (Panther link)

    Proteins where this domain is known:
    MAL13P1.166    PF14_0234    PF14_0370    PFD1060w    PFF0100w    PFI0165c    PFI0480w   


    PTHR11752:SF2 - PTHR11752:SF2 (Panther link)

    Proteins where this domain is known:
    PFF0100w    PFI0165c    PFI0480w   


    PTHR11752:SF3 - PTHR11752:SF3 (Panther link)

    Proteins where this domain is known:
    MAL13P1.166    PF14_0234   


    PTHR11752:SF6 - PTHR11752:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0370   


    PTHR11752:SF7 - PTHR11752:SF7 (Panther link)

    Proteins where this domain is known:
    PFD1060w   


    PTHR11753 - PTHR11753 (Panther link)

    Proteins where this domain is known:
    PF11_0187    PFB0805c    PFD1090c    PFL2425w   


    PTHR11753:SF2 - PTHR11753:SF2 (Panther link)

    Proteins where this domain is known:
    PFL2425w   


    PTHR11753:SF4 - PTHR11753:SF4 (Panther link)

    Proteins where this domain is known:
    PFD1090c   


    PTHR11753:SF5 - Clathrin_coat_assembly_AP19 (Panther link)

    Interpro entry IPR015604 : Clathrin adaptor AP1, sigma subunit (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents the small sigma subunit of clathrin adaptor AP1, which is also known as the AP19 subunit. The small sigma subunit of AP proteins has been characterised in several species. The sigma subunit plays a role in protein sorting in the late-Golgi/trans-Golgi network (TGN) and/or endosomes.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0187   


    PTHR11753:SF6 - PTHR11753:SF6 (Panther link)

    Proteins where this domain is known:
    PFB0805c   


    PTHR11758 - Ribosomal_S8 (Panther link)

    Interpro entry IPR000630 : Ribosomal protein S8 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups eubacterial, algal and plant chloroplast, cyanelle, archaebacterial and Marchantia polymorpha mitochondrial S8; mammalian and plant S15A; and yeast S22 (S24) ribosomal proteins.

    Proteins where this domain is known:
    MAL7P1.93    PFC0735w   


    PTHR11759 - Ribosomal_S11 (Panther link)

    Interpro entry IPR001971 : Ribosomal protein S11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S11 plays an essential role in selecting the correct tRNA in protein biosynthesis. It is located on the large lobe of the small ribosomal subunit. On the basis of sequence similarities, S11 belongs to a family of bacterial, archaeal and eukaryotic ribosomal proteins.

    Proteins where this domain is known:
    PFE0810c   


    PTHR11759:SF1 - PTHR11759:SF1 (Panther link)

    Proteins where this domain is known:
    PFE0810c   


    PTHR11760 - PTHR11760 (Panther link)

    Proteins where this domain is known:
    PF14_0627   


    PTHR11760:SF9 - PTHR11760:SF9 (Panther link)

    Proteins where this domain is known:
    PF14_0627   


    PTHR11761 - Ribosomal_L14 (Panther link)

    Interpro entry IPR000218 : Ribosomal protein L14b/L23e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.

    Proteins where this domain is known:
    PF13_0171    PFE0960w   


    PTHR11761:SF2 - PTHR11761:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0960w   


    PTHR11761:SF4 - PTHR11761:SF4 (Panther link)

    Proteins where this domain is known:
    PF13_0171   


    PTHR11766 - Tyr_tRNA-synt_1b (Panther link)

    Interpro entry IPR002307 : Tyrosyl-tRNA synthetase, class Ib, bacterial/mitochondrial (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Tyrosyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. Studies on tyrosyl-tRNA synthetase provide the first kinetic evidence that the 'KMSKS' motif plays a role in the initial binding of tRNA(Tyr) to tyrosyl-tRNA synthetase.

    Proteins where this domain is known:
    PF11_0181   


    PTHR11772 - PTHR11772 (Panther link)

    Proteins where this domain is known:
    PF13_0137    PFC0395w   


    PTHR11774 - PTHR11774 (Panther link)

    Proteins where this domain is known:
    PF11_0483    PFF0120w    PFL0695c   


    PTHR11774:SF2 - PTHR11774:SF2 (Panther link)

    Proteins where this domain is known:
    PFL0695c   


    PTHR11774:SF5 - PTHR11774:SF5 (Panther link)

    Proteins where this domain is known:
    PFF0120w   


    PTHR11774:SF6 - PTHR11774:SF6 (Panther link)

    Proteins where this domain is known:
    PF11_0483   


    PTHR11777 - PTHR11777 (Panther link)

    Proteins where this domain is known:
    PF13_0354   


    PTHR11778 - tRNA-synt_ser (Panther link)

    Interpro entry IPR018156 : Seryl-tRNA synthetase, class IIa, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class-II synthetases.

    Seryl-tRNA synthetase exists as a monomer and belongs to class IIa.

    Proteins where this domain is known:
    PF07_0073    PFL0770w   


    PTHR11782 - GDA1_CD39_NTPase (Panther link)

    Interpro entry IPR000407 : Nucleoside phosphatase GDA1/CD39 (Interpro link)

    Interpro description:

    A number of nucleoside diphosphate and triphosphate hydrolases as well as some yet uncharacterised proteins have been found to belong to the same family. The uncharacterised proteins all seem to be membrane-bound.

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).

    Proteins where this domain is known:
    MAL13P1.121    PF14_0297   


    PTHR11782:SF2 - PTHR11782:SF2 (Panther link)

    Proteins where this domain is known:
    MAL13P1.121   


    PTHR11787 - Rab_GDI_REP (Panther link)

    Interpro entry IPR002005 : Rab GTPase activator (Interpro link)

    Interpro description:
    Rab proteins constitute a family of small GTPases that serve a regulatory role in vesicular membrane traffic; C-terminal geranylgeranylation is crucial for their membrane association and function. This post-translational modification is catalysed by Rab geranylgeranyl transferase (Rab-GGTase), a multi-subunit enzyme that contains a catalytic heterodimer and an accessory component, termed Rab escort protein (REP)-1. REP-1 presents newly synthesised Rab proteins to the catalytic component, and forms a stable complex with the prenylated proteins following the transfer reaction.

    The mechanism of REP-1-mediated membrane association of Rab5 is similar to that mediated by Rab GDP dissociation inhibitor (GDI). REP-1 and Rab GDI also share other functional properties, including the ability to inhibit the release of GDP and to remove Rab proteins from membranes.

    The crystal structure of the bovine alpha-isoform of Rab GDI has been determined to a resolution of 1.81 Å. The protein is composed of two main structural units: a large complex multi-sheet domain I, and a smaller alpha-helical domain II.

    The structural organisation of domain I is closely related to FAD-containing monooxygenases and oxidases. Conserved regions common to GDI and the choroideraemia gene product, which delivers Rab to catalytic subunits of Rab geranylgeranyltransferase II, are clustered on one face of the domain. The two most conserved regions form a compact structure at the apex of the molecule; site-directed mutagenesis has shown these regions to play a critical role in the binding of Rab proteins.

    Proteins where this domain is known:
    PFL2060c   


    PTHR11800 - PTHR11800 (Panther link)

    Proteins where this domain is known:
    PF11_0445    PFI1130c   


    PTHR11800:SF13 - PTHR11800:SF13 (Panther link)

    Proteins where this domain is known:
    PF11_0445   


    PTHR11800:SF2 - PTHR11800:SF2 (Panther link)

    Proteins where this domain is known:
    PFI1130c   


    PTHR11804 - PTHR11804 (Panther link)

    Proteins where this domain is known:
    MAL13P1.184   


    PTHR11804:SF3 - PTHR11804:SF3 (Panther link)

    Proteins where this domain is known:
    MAL13P1.184   


    PTHR11806 - GIDA (Panther link)

    Interpro entry IPR002218 : Glucose-inhibited division protein A-related (Interpro link)

    Interpro description:

    GidA is a tRNA modification enzyme found in bacteria and mitochondria. Though its precise molecular function is not known, it is involved in the 5-carboxymethylaminomethyl modification of the wobble uridine base in some tRNAs. Sequence variations in the human mitochondrial protein may influence the severity of aminoglycoside-induced deafness.

    This entry represents GidA and related proteins, such as Gid, whose functions are not known.

    Proteins where this domain is known:
    PFL2115c   


    PTHR11807 - PTHR11807 (Panther link)

    Proteins where this domain is known:
    PFF0610c   


    PTHR11807:SF3 - PTHR11807:SF3 (Panther link)

    Proteins where this domain is known:
    PFF0610c   


    PTHR11809 - PTHR11809 (Panther link)

    Proteins where this domain is known:
    PFB0545c    PFE1225w   


    PTHR11809:SF1 - PTHR11809:SF1 (Panther link)

    Proteins where this domain is known:
    PFB0545c   


    PTHR11811 - PTHR11811 (Panther link)

    Proteins where this domain is known:
    PF14_0520   


    PTHR11814 - PTHR11814 (Panther link)

    Proteins where this domain is known:
    PF14_0679   


    PTHR11814:SF5 - PTHR11814:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0679   


    PTHR11815 - CoA_lig_beta (Panther link)

    Interpro entry IPR005809 : Succinyl-CoA synthetase, beta subunit (Interpro link)

    Interpro description:

    There are four different enzymes that share a similar catalytic mechanism which involves the phosphorylation by ATP (or GTP) of a specific histidine residue in the active site. These enzymes are:

  • ATP citrate-lyase, the primary enzyme responsible for the synthesis of cytosolic acetyl-CoA in many tissues, which catalyzes the formation of acetyl-CoA and oxaloacetate from citrate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. ATP citrate-lyase is a tetramer of identical subunits.
  • Succinyl-CoA ligase (GDP-forming), a mitochondrial enzyme that catalyzes the substrate level phosphorylation step of the tricarboxylic acid cycle: the formation of succinyl-CoA from succinate with a concomitant hydrolysis of GTP to GDP and phosphate. This enzyme is a dimer composed of an alpha and a beta subunit.
  • Succinyl-CoA ligase (ADP-forming), a bacterial enzyme that during aerobic metabolism functions in the citric acid cycle, coupling the hydrolysis of succinyl-CoA to the synthesis of ATP. It can also function in the other direction for anabolic purposes. This enzyme is a tetramer composed of two alpha and two beta subunits.
  • Malate-CoA ligase (malyl-CoA synthetase), a bacterial enzyme that forms malyl-CoA from malate and CoA with the concomitant hydrolysis of ATP to ADP and phosphate. Malate-CoA ligase is composed of two different subunits.

    Proteins where this domain is known:
    PF14_0295   


    PTHR11817 - Pyruvate_kinase (Panther link)

    Interpro entry IPR001697 : Pyruvate kinase (Interpro link)

    Interpro description:

    Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:

     ADP + phosphoenolpyruvate = ATP + pyruvate 

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes. Most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which triggers covalent modification of the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.
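
    The isozyme-specific effector responses described above can be summarised as a small lookup table. This is only an illustrative sketch: the dictionary layout, the labels ("activates", "inhibits", "no effect") and the function name are assumptions made here, and only the L-type and M1-type responses stated in the description are encoded.

```python
# Allosteric responses as stated in the description above.
# Keys and labels are illustrative choices, not an established schema.
PK_EFFECTORS = {
    "L":  {"F1,6BP": "activates", "ATP": "inhibits", "alanine": "inhibits"},
    "M1": {"F1,6BP": "no effect", "ATP": "inhibits", "alanine": "no effect"},
}


def effect(isozyme: str, effector: str) -> str:
    """Return the stated response of an isozyme to an effector, or 'not stated'."""
    return PK_EFFECTORS.get(isozyme, {}).get(effector, "not stated")


# Print the two isozymes side by side for each effector.
for name in ("F1,6BP", "ATP", "alanine"):
    print(f"{name:8s} L: {effect('L', name):10s} M1: {effect('M1', name)}")
```

    Isozymes whose responses are not given in the description (R and M2) deliberately fall through to "not stated" rather than being guessed.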

    The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    Proteins where this domain is known:
    PF10_0363    PFF1300w   


    PTHR11821 - Hsp40/DnaJ_Rel (Panther link)

    Interpro entry IPR015609 : (Interpro link)

    Interpro description:

    The Escherichia coli Hsp40 DnaJ and Hsp70 DnaK cooperate in the binding of proteins at intermediate stages of folding, assembly, and translocation across membranes. Binding of protein substrates to the DnaK C-terminal domain is controlled by ATP-binding and hydrolysis in the N-terminal ATPase domain. The interaction of DnaJ with DnaK is mediated at least in part by the highly conserved N-terminal J-domain of DnaJ. The J-domain interaction is localised to the ATPase domain of DnaK and is likely to be dominated by electrostatic interactions. The J-domain may tether DnaK to DnaJ-bound substrates, which DnaK then binds with its C-terminal peptide-binding domain. The peptide-binding domain of DnaJ is composed of a beta sandwich made up of 6 beta-strands divided into 2 sheets.

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP-binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate-binding domain. Dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    Besides stimulating the ATPase activity of DnaK through its J-domain, DnaJ also associates with unfolded polypeptide chains and prevents their aggregation. Thus, DnaK and DnaJ may bind to one and the same polypeptide chain to form a ternary complex. The formation of a ternary complex may result in cis-interaction of the J-domain of DnaJ with the ATPase domain of DnaK. An unfolded polypeptide may enter the chaperone cycle by associating first either with ATP-liganded DnaK or with DnaJ. DnaK interacts with both the backbone and side chains of a peptide substrate; it thus shows binding polarity and admits only L-peptide segments. In contrast, DnaJ has been shown to bind both L- and D-peptides and is assumed to interact only with the side chains of the substrate.

    Proteins where this domain is known:
    MAL13P1.162    MAL13P1.277    MAL8P1.204    PF08_0032    PF08_0115    PF10_0058    PF10_0378    PF10_0381    PF11_0034    PF11_0099    PF11_0380    PF11_0443    PF11_0512    PF11_0513    PF13_0102    PF14_0013    PF14_0137    PF14_0359    PF14_0700    PFA0660w    PFB0085c    PFB0090c    PFB0595w    PFB0920w    PFB0925w    PFD0462w    PFE0055c    PFE0135w    PFE1170w    PFF1415c    PFI0935w    PFI0985c    PFL0055c    PFL0565w    PFL0815w    PFL2550w   


    PTHR11821:SF10 - PTHR11821:SF10 (Panther link)

    Proteins where this domain is known:
    PF10_0058   


    PTHR11821:SF12 - DNAJ-RELATED (Panther link)

    Proteins where this domain is known:
    MAL8P1.204    PF08_0115    PF10_0378    PFB0925w    PFL0055c    PFL2550w   


    PTHR11821:SF22 - PTHR11821:SF22 (Panther link)

    Proteins where this domain is known:
    PFB0085c   


    PTHR11821:SF29 - PTHR11821:SF29 (Panther link)

    Proteins where this domain is known:
    PFL0815w   


    PTHR11821:SF36 - PTHR11821:SF36 (Panther link)

    Proteins where this domain is known:
    PF11_0443   


    PTHR11821:SF41 - PTHR11821:SF41 (Panther link)

    Proteins where this domain is known:
    PFE0135w   


    PTHR11821:SF47 - PTHR11821:SF47 (Panther link)

    Proteins where this domain is known:
    PF11_0034   


    PTHR11821:SF48 - DNAJ-RELATED (Panther link)

    Proteins where this domain is known:
    PFL0565w   


    PTHR11821:SF49 - DNAJ-RELATED (Panther link)

    Proteins where this domain is known:
    PFI0985c   


    PTHR11821:SF52 - PTHR11821:SF52 (Panther link)

    Proteins where this domain is known:
    PF10_0381   


    PTHR11821:SF57 - PTHR11821:SF57 (Panther link)

    Proteins where this domain is known:
    PF11_0513    PF14_0013    PFB0920w    PFE1170w   


    PTHR11821:SF59 - PTHR11821:SF59 (Panther link)

    Proteins where this domain is known:
    PF11_0380   


    PTHR11821:SF60 - PTHR11821:SF60 (Panther link)

    Proteins where this domain is known:
    PF14_0700   


    PTHR11821:SF61 - PTHR11821:SF61 (Panther link)

    Proteins where this domain is known:
    MAL13P1.162    PF08_0032   


    PTHR11821:SF68 - PTHR11821:SF68 (Panther link)

    Proteins where this domain is known:
    PF11_0099   


    PTHR11821:SF7 - PTHR11821:SF7 (Panther link)

    Proteins where this domain is known:
    PF13_0102   


    PTHR11821:SF72 - PTHR11821:SF72 (Panther link)

    Proteins where this domain is known:
    PF11_0512    PF14_0137    PF14_0359   


    PTHR11821:SF74 - PTHR11821:SF74 (Panther link)

    Proteins where this domain is known:
    PFF1415c   


    PTHR11821:SF75 - PTHR11821:SF75 (Panther link)

    Proteins where this domain is known:
    PFD0462w   


    PTHR11821:SF79 - PTHR11821:SF79 (Panther link)

    Proteins where this domain is known:
    MAL13P1.277    PFI0935w   


    PTHR11821:SF87 - PTHR11821:SF87 (Panther link)

    Proteins where this domain is known:
    PFA0660w    PFB0090c    PFB0595w    PFE0055c   


    PTHR11822 - IDH_NADP_euk (Panther link)

    Interpro entry IPR004790 : Isocitrate dehydrogenase NADP-dependent, eukaryotic (Interpro link)

    Interpro description:

    Isocitrate dehydrogenase (IDH) is an important enzyme of carbohydrate metabolism which catalyzes the oxidative decarboxylation of isocitrate into alpha-ketoglutarate. IDH is either dependent on NAD+ or on NADP+. In eukaryotes there are at least three isozymes of IDH: two are located in the mitochondrial matrix (one NAD+-dependent, the other NADP+-dependent), while the third one (also NADP+-dependent) is cytoplasmic. In Escherichia coli the activity of an NADP+-dependent form of the enzyme is controlled by the phosphorylation of a serine residue; the phosphorylated form of IDH is completely inactivated.

    This group defines the eukaryotic NADP-dependent isocitrate dehydrogenases, including the cytosolic, mitochondrial, and chloroplast enzymes, but it also hits a small number of bacterial proteins.

    Proteins where this domain is known:
    PF13_0242   


    PTHR11825 - Aminotrans_IV (Panther link)

    Interpro entry IPR001544 : Aminotransferase, class IV (Interpro link)

    Interpro description:

    Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies.

    One of these, called class-IV, currently consists of proteins of about 270 to 415 amino-acid residues that share a few regions of sequence similarity. Surprisingly, the best conserved region does not include the lysine residue to which the pyridoxal-phosphate group is known to be attached in ilvE, but is located some 40 residues to the C-terminal side of the pyridoxal-phosphate-lysine. The D-amino acid transferases (D-AAT), which are among the members of this entry, are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of the bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.

    Proteins where this domain is known:
    PF14_0557   


    PTHR11825:SF8 - PTHR11825:SF8 (Panther link)

    Proteins where this domain is known:
    PF14_0557   


    PTHR11830 - Ribosomal_S3AE (Panther link)

    Interpro entry IPR001593 : Ribosomal protein S3Ae (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of proteins that have from 220 to 250 amino acids.

    Proteins where this domain is known:
    PFC1020c   


    PTHR11831 - Ribosomal_S4 (Panther link)

    Interpro entry IPR001912 : Ribosomal protein S4 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S4 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S4 is known to bind directly to 16S ribosomal RNA. Mutations in S4 have been shown to increase translational error frequencies. S4 is a protein of 171 to 205 amino-acid residues (except for NAM9, which is much larger). The crystal structure of a bacterial S4 protein revealed a two-domain molecule. The first domain is composed of four helices in the known structure. The second domain is in the middle of the first one and displays some structural homology with the ETS DNA binding domain. This family includes small ribosomal subunit S4 from prokaryotes and S9 from animals.

    Proteins where this domain is known:
    PF14_0584    PFE1005w   


    PTHR11831:SF1 - PTHR11831:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0584   


    PTHR11831:SF3 - PTHR11831:SF3 (Panther link)

    Proteins where this domain is known:
    PFE1005w   


    PTHR11842 - PTHR11842 (Panther link)

    Proteins where this domain is known:
    PF10_0227   


    PTHR11842:SF10 - PTHR11842:SF10 (Panther link)

    Proteins where this domain is known:
    PF10_0227   


    PTHR11843 - Ribosomal_S12e (Panther link)

    Interpro entry IPR000530 : Ribosomal protein S12e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. The small ribosomal subunit protein S12 contains 130-150 amino acid residues, and is thought to be involved in the translation initiation step. This family consists of eukaryotic S12 ribosomal proteins, including those from vertebrates, Trypanosoma brucei, Caenorhabditis elegans, Drosophila and Saccharomyces cerevisiae (Baker's yeast).

    Proteins where this domain is known:
    PFC0295c   


    PTHR11846 - Asucc_synthtase (Panther link)

    Interpro entry IPR001114 : Adenylosuccinate synthetase (Interpro link)

    Interpro description:

    Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalysing the GTP-dependent conversion of IMP and aspartic acid to adenylosuccinate, the first step in the conversion of IMP to AMP. Adenylosuccinate synthetase has been characterised from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present: one involved in purine biosynthesis and the other in the purine nucleotide cycle.

    The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, composed of two strands and three strands respectively, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Comparison of the structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein.

    Proteins where this domain is known:
    PF13_0287   


    PTHR11847 - Ribosomal_L15e (Panther link)

    Interpro entry IPR000439 : Ribosomal protein L15e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial ribosomal proteins can be grouped on the basis of sequence similarities. One of these families consists of:

  • Mammalian L15.
  • Insect L15.
  • Plant L15.
  • Yeast YL10 (L13) (Rp15r).
  • Archaebacterial L15e.

    These proteins have about 200 amino acid residues.

    Proteins where this domain is known:
    PFD0770c   


    PTHR11847:SF1 - PTHR11847:SF1 (Panther link)

    Proteins where this domain is known:
    PFD0770c   


    PTHR11851 - PTHR11851 (Panther link)

    Proteins where this domain is known:
    PF11_0189    PF13_0322    PF14_0382    PFE1155c    PFI1625c   


    PTHR11851:SF34 - PTHR11851:SF34 (Panther link)

    Proteins where this domain is known:
    PF14_0382   


    PTHR11851:SF49 - PTHR11851:SF49 (Panther link)

    Proteins where this domain is known:
    PFE1155c   


    PTHR11851:SF58 - MITOCHONDRIAL PROCESSING PEPTIDASE BETA SUBUNIT (Panther link)

    Proteins where this domain is known:
    PFI1625c   


    PTHR11851:SF68 - METALLOPROTEASE 1-RELATED (Panther link)

    Proteins where this domain is known:
    PF11_0189    PF13_0322   


    PTHR11864 - PTHR11864 (Panther link)

    Proteins where this domain is known:
    PF13_0091   


    PTHR11875 - NAP_family (Panther link)

    Interpro entry IPR002164 : Nucleosome assembly protein (NAP) (Interpro link)

    Interpro description:

    It is thought that NAPs act as histone chaperones, shuttling both core and linker histones from their site of synthesis in the cytoplasm to the nucleus. The proteins may be involved in regulating gene expression and therefore cellular differentiation.

    The centrosomal protein C-Nap1, also known as Cep250, has been implicated in the cell-cycle-regulated cohesion of microtubule-organizing centres. This 281 kDa protein consists mainly of domains predicted to form coiled coil structures. The C-terminal region defines a novel histone-binding domain that is responsible for targeting CNAP1, and possibly condensin, to mitotic chromosomes. During interphase, C-Nap1 localizes to the proximal ends of both parental centrioles, but it dissociates from these structures at the onset of mitosis. Re-association with centrioles then occurs in late telophase or at the very beginning of G1 phase, when daughter cells are still connected by post-mitotic bridges. Electron microscopic studies performed on isolated centrosomes suggest that a proteinaceous linker connects parental centrioles and C-Nap1 may be part of a linker structure that assures the cohesion of duplicated centrosomes during interphase, but that is dismantled upon centrosome separation at the onset of mitosis.

    Proteins where this domain is known:
    PFI0930c    PFL0185c   


    PTHR11875:SF7 - PTHR11875:SF7 (Panther link)

    Proteins where this domain is known:
    PFL0185c   


    PTHR11875:SF9 - SET (Panther link)

    Proteins where this domain is known:
    PFI0930c   


    PTHR11879 - Asp_trans (Panther link)

    Interpro entry IPR000796 : Aspartate/other aminotransferase (Interpro link)

    Interpro description:

    Aspartate aminotransferase is important for the metabolism of amino acids and Krebs-cycle related organic acids. In plants, it is involved in nitrogen metabolism and in aspects of carbon and energy metabolism. The enzyme catalyses the reaction:

     L-aspartate + 2-oxoglutarate = oxaloacetate + L-glutamate 

    Aminotransferases share certain mechanistic features with other pyridoxal-phosphate-dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. This family also includes some aromatic-amino-acid aminotransferases.

    Proteins where this domain is known:
    PFB0200c   


    PTHR11880 - Ribosomal_S19 (Panther link)

    Interpro entry IPR002222 : Ribosomal protein S19/S15 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).
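
    The primary/secondary/tertiary hierarchy described above behaves like a dependency ordering, and can be sketched as follows. This is a simplified illustration only: the protein names are hypothetical placeholders rather than real ribosomal proteins, the prerequisite table is invented for the example, and the sketch assumes that every listed prerequisite must be bound and that the table has no circular dependencies.

```python
from typing import Dict, List, Set

# Hypothetical prerequisite table: protein -> proteins that must already be bound.
# Names are placeholders for illustration, not real ribosomal proteins.
REQUIRES: Dict[str, Set[str]] = {
    "primary_A": set(),             # binds 16S rRNA directly and independently
    "primary_B": set(),
    "secondary_C": {"primary_A"},   # needs a primary binding protein
    "tertiary_D": {"secondary_C"},  # needs a secondary binding protein
}


def depth(protein: str, requires: Dict[str, Set[str]]) -> int:
    """Length of the longest prerequisite chain below a protein (assumes no cycles)."""
    deps = requires[protein]
    return 0 if not deps else 1 + max(depth(d, requires) for d in deps)


def binding_class(protein: str, requires: Dict[str, Set[str]]) -> str:
    """Map chain depth onto the three categories used in the description."""
    d = depth(protein, requires)
    return "primary" if d == 0 else "secondary" if d == 1 else "tertiary"


def assembly_order(requires: Dict[str, Set[str]]) -> List[str]:
    """Return proteins in an order where every prerequisite comes first."""
    bound: List[str] = []
    remaining = dict(requires)
    while remaining:
        # A protein is ready once all of its prerequisites are already bound.
        ready = sorted(p for p, deps in remaining.items() if deps <= set(bound))
        if not ready:
            raise ValueError("circular or unsatisfiable prerequisites")
        bound.extend(ready)
        for p in ready:
            del remaining[p]
    return bound


print(assembly_order(REQUIRES))
print({p: binding_class(p, REQUIRES) for p in REQUIRES})
```

    Classifying by chain depth is a design choice of this sketch; the description itself defines the categories by which class of protein must already be present.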

    Proteins where this domain is known:
    MAL13P1.92   


    PTHR11880:SF2 - Ribosomal_S15e/a (Panther link)

    Interpro entry IPR005713 : Ribosomal protein S15, eukaryotic/archaeal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family represents eukaryotic ribosomal protein S15 and its archaeal equivalent. It excludes bacterial and organellar ribosomal protein S19. The nomenclature for the archaeal members is unresolved and given variously as S19 (after the more distant bacterial homologs) or S15.

    Proteins where this domain is known:
    MAL13P1.92   


    PTHR11885 - PTHR11885 (Panther link)

    Proteins where this domain is known:
    PF13_0316   


    PTHR11885:SF1 - PTHR11885:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0316   


    PTHR11886 - Dynein_light1 (Panther link)

    Interpro entry IPR001372 : Dynein light chain, type 1 and 2 (Interpro link)

    Interpro description:

    Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.

    Dynein is composed of a number of ATP-binding large subunits, intermediate size subunits and small subunits. Among the small subunits, there is a family of highly conserved proteins which make up this family.

    Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).

    Proteins where this domain is known:
    MAL7P1.161    PF13_0306    PFL0660w   


    PTHR11895 - Amidase (Panther link)

    Interpro entry IPR000120 : Amidase signature enzyme (Interpro link)

    Interpro description:

    Amidase signature (AS) enzymes are a large group of hydrolytic enzymes that contain a conserved stretch of approximately 130 amino acids known as the AS sequence. They are widespread, being found in both prokaryotes and eukaryotes. AS enzymes catalyse the hydrolysis of amide bonds (CO-NH2), although the family has diverged widely with regard to substrate specificity and function. Nonetheless, these enzymes maintain a core alpha/beta/alpha structure, where the topologies of the N- and C-terminal halves are similar. AS enzymes characteristically have a highly conserved C-terminal region rich in serine and glycine residues, but devoid of aspartic acid and histidine residues; they therefore differ from classical serine hydrolases. These enzymes possess a unique, highly conserved Ser-Ser-Lys catalytic triad used for amide hydrolysis, although the catalytic mechanism for acyl-enzyme intermediate formation can differ between enzymes.

    Examples of AS enzymes include:

    Proteins where this domain is known:
    PFD0780w   


    PTHR11895:SF7 - PTHR11895:SF7 (Panther link)

    Proteins where this domain is known:
    PFD0780w   


    PTHR11896 - Mitoch_carrier (Panther link)

    Interpro entry IPR001993 : Mitochondrial substrate carrier (Interpro link)

    Interpro description:

    A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.

    Proteins where this domain is known:
    PF08_0031    PF08_0093    PF10_0051    PF10_0366    PF13_0359    PFA0415c    PFA0435w    PFD0367w    PFI0255c    PFI0425w    PFL0110c    PFL1145w    PFL2000w   


    PTHR11896:SF13 - PTHR11896:SF13 (Panther link)

    Proteins where this domain is known:
    PF10_0051   


    PTHR11896:SF22 - MITOCHONDRIAL PHOSPHATE CARRIER PROTEIN (Panther link)

    Proteins where this domain is known:
    PFL0110c   


    PTHR11896:SF34 - PTHR11896:SF34 (Panther link)

    Proteins where this domain is known:
    PFI0425w   


    PTHR11896:SF35 - PTHR11896:SF35 (Panther link)

    Proteins where this domain is known:
    PF10_0366    PF13_0359   


    PTHR11896:SF38 - PTHR11896:SF38 (Panther link)

    Proteins where this domain is known:
    PFA0435w   


    PTHR11896:SF43 - PTHR11896:SF43 (Panther link)

    Proteins where this domain is known:
    PFD0367w   


    PTHR11896:SF45 - PTHR11896:SF45 (Panther link)

    Proteins where this domain is known:
    PFL2000w   


    PTHR11896:SF77 - PTHR11896:SF77 (Panther link)

    Proteins where this domain is known:
    PFI0255c   


    PTHR11896:SF78 - PTHR11896:SF78 (Panther link)

    Proteins where this domain is known:
    PF08_0031   


    PTHR11896:SF80 - PTHR11896:SF80 (Panther link)

    Proteins where this domain is known:
    PF08_0093   


    PTHR11896:SF82 - PTHR11896:SF82 (Panther link)

    Proteins where this domain is known:
    PFL1145w   


    PTHR11902 - Enolase (Panther link)

    Interpro entry IPR000941 : Enolase (Interpro link)

    Interpro description:

    Enolase (2-phospho-D-glycerate hydrolase) is an essential glycolytic enzyme that catalyses the interconversion of 2-phosphoglycerate and phosphoenolpyruvate. In vertebrates, there are 3 different, tissue-specific isoenzymes, designated alpha, beta and gamma. Alpha is present in most tissues, beta is localised in muscle tissue, and gamma is found only in nervous tissue. The functional enzyme exists as a dimer of any 2 isoforms. In immature organs and in adult liver, it is usually an alpha homodimer; in adult skeletal muscle, a beta homodimer; and in adult neurons, a gamma homodimer. In developing muscle, it is usually an alpha/beta heterodimer, and in the developing nervous system, an alpha/gamma heterodimer. The tissue-specific forms display minor kinetic differences. Tau-crystallin, one of the major lens proteins in some fish, reptiles and birds, has been shown to be evolutionarily related to enolase.

    Neuron-specific enolase is released in a variety of neurological diseases, such as multiple sclerosis and after seizures or acute stroke. Several tumour cells have also been found positive for neuron-specific enolase. Beta-enolase deficiency is associated with glycogenosis type XIII defect.

    Proteins where this domain is known:
    PF10_0155   


    PTHR11909 - PTHR11909 (Panther link)

    Proteins where this domain is known:
    PF11_0377   


    PTHR11909:SF18 - PTHR11909:SF18 (Panther link)

    Proteins where this domain is known:
    PF11_0377   


    PTHR11910 - ATPase_F1_OSCP/d (Panther link)

    Interpro entry IPR000711 : ATPase, F1 complex, OSCP/delta subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    MAL13P1.47   


    PTHR11911 - PTHR11911 (Panther link)

    Proteins where this domain is known:
    PFI1020c   


    PTHR11911:SF6 - IMP_DH_GMPRtase (Panther link)

    Proteins where this domain is known:
    PFI1020c   


    PTHR11913 - PTHR11913 (Panther link)

    Proteins where this domain is known:
    PF13_0326    PFE0165w   


    PTHR11913:SF12 - PTHR11913:SF12 (Panther link)

    Proteins where this domain is known:
    PF13_0326    PFE0165w   


    PTHR11918 - UPF0004 (Panther link)

    Interpro entry IPR005839 : (Interpro link)

    Interpro description:

    This entry represents a family defined on the basis of sequence similarity. Most of these proteins are not yet characterised, but those that are include

    The size of proteins in this entry ranges from 47 to 61 kDa and they contain six conserved cysteines, three of which are clustered.

    Proteins where this domain is known:
    PFF1070c   


    PTHR11921 - SUCCINATE DEHYDROGENASE IRON-SULFUR PROTEIN (Panther link)

    Proteins where this domain is known:
    PFL0630w   


    PTHR11922 - PTHR11922 (Panther link)

    Proteins where this domain is known:
    PF10_0123   


    PTHR11922:SF2 - PTHR11922:SF2 (Panther link)

    Proteins where this domain is known:
    PF10_0123   


    PTHR11931 - Phosphogly_mut1 (Panther link)

    Interpro entry IPR005952 : Phosphoglycerate mutase 1 (Interpro link)

    Interpro description:

    Most members of this family are phosphoglycerate mutase. This enzyme interconverts 2-phosphoglycerate and 3-phosphoglycerate.

      2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate.
    The enzyme is transiently phosphorylated on an active site histidine by 2,3-diphosphoglycerate, which is both substrate and product. Some members of this family have phosphoglycerate mutase as a minor activity and act primarily as a bisphosphoglycerate mutase, interconverting 2,3-diphosphoglycerate and 1,3-diphosphoglycerate.

    Proteins where this domain is known:
    PF11_0208   


    PTHR11932 - PTHR11932 (Panther link)

    Proteins where this domain is known:
    PF08_0094    PFF1445c   


    PTHR11932:SF5 - PTHR11932:SF5 (Panther link)

    Proteins where this domain is known:
    PFF1445c   


    PTHR11932:SF9 - PTHR11932:SF9 (Panther link)

    Proteins where this domain is known:
    PF08_0094   


    PTHR11933 - TrmU_mtfrase (Panther link)

    Interpro entry IPR004506 : tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (Interpro link)

    Interpro description:
    tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase catalyses the addition of 5-methylaminomethyl-2-thiouridylate to tRNAs using S-adenosyl-L-methionine as a substrate and releasing S-adenosyl-L-homocysteine. The enzyme is cytoplasmic and is involved in tRNA processing.

    Proteins where this domain is known:
    PF10_0191   


    PTHR11934 - RpiA (Panther link)

    Interpro entry IPR004788 : Ribose 5-phosphate isomerase (Interpro link)

    Interpro description:

    Ribose 5-phosphate isomerase, also known as phosphoriboisomerase, catalyses the conversion of D-ribose 5-phosphate to D-ribulose 5-phosphate in the non-oxidative branch of the pentose phosphate pathway. The pentose phosphate pathway is a target for chemotherapy against Chagas disease. This family of enzymes is coded for by two genes and is found in many taxa except the viruses. It is a highly conserved enzyme.

    Proteins where this domain is known:
    PFE0730c   


    PTHR11935 - BETA LACTAMASE DOMAIN (Panther link)

    Proteins where this domain is known:
    PFD0311w    PFL0285w   


    PTHR11935:SF7 - HYDROXYACYLGLUTATHIONE HYDROLASE (Panther link)

    Proteins where this domain is known:
    PFD0311w    PFL0285w   


    PTHR11937 - Actin_like (Panther link)

    Interpro entry IPR004000 : Actin/actin-like (Interpro link)

    Interpro description:

    Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.

    In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc.

    Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule based vesicle motility (this subfamily is known as ARP1); ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.

    Proteins where this domain is known:
    MAL7P1.153    PF07_0077    PF11_0047    PF11_0114    PF14_0124    PF14_0218    PFA0190c    PFD0487c    PFE0255w    PFI0520w    PFL2215w   


    PTHR11937:SF14 - PTHR11937:SF14 (Panther link)

    Proteins where this domain is known:
    MAL7P1.153   


    PTHR11937:SF15 - PTHR11937:SF15 (Panther link)

    Proteins where this domain is known:
    PF11_0047   


    PTHR11937:SF17 - PTHR11937:SF17 (Panther link)

    Proteins where this domain is known:
    PFE0255w   


    PTHR11937:SF21 - PTHR11937:SF21 (Panther link)

    Proteins where this domain is known:
    PF07_0077   


    PTHR11937:SF27 - PTHR11937:SF27 (Panther link)

    Proteins where this domain is known:
    PF14_0218   


    PTHR11937:SF29 - PTHR11937:SF29 (Panther link)

    Proteins where this domain is known:
    PF11_0114   


    PTHR11937:SF46 - PTHR11937:SF46 (Panther link)

    Proteins where this domain is known:
    PFA0190c   


    PTHR11937:SF5 - PTHR11937:SF5 (Panther link)

    Proteins where this domain is known:
    PFD0487c   


    PTHR11938 - FAD NADPH DEHYDROGENASE/OXIDOREDUCTASE (Panther link)

    Proteins where this domain is known:
    PF11_0407    PF14_0334    PFF0160c   


    PTHR11938:SF1 - PTHR11938:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0334   


    PTHR11938:SF4 - PTHR11938:SF4 (Panther link)

    Proteins where this domain is known:
    PF11_0407   


    PTHR11938:SF7 - DIHYDROOROTATE DEHYDROGENASE (Panther link)

    Proteins where this domain is known:
    PFF0160c   


    PTHR11939 - ATPase_P (Panther link)

    Interpro entry IPR001757 : ATPase, P-type, K/Mg/Cd/Cu/Zn/Na/Ca/Na/H-transporter (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    P-ATPases (sometimes known as E1-E2 ATPases) are found in bacteria and in a number of eukaryotic plasma membranes and organelles. P-ATPases function to transport a variety of different compounds, including ions and phospholipids, across a membrane using ATP hydrolysis for energy. There are many different classes of P-ATPases, each of which transports a specific type of ion: H+, Na+, K+, Mg2+, Ca2+, Ag+ and Ag2+, Zn2+, Co2+, Pb2+, Ni2+, Cd2+, Cu+ and Cu2+. P-ATPases can be composed of one or two polypeptides, and can usually assume two main conformations called E1 and E2.

    This entry represents several classes of P-type ATPases, including those that transport K+, Mg2+, Cd2+, Cu2+, Zn2+, Na+, Ca2+, Na+/K+, and H+/K+. These P-ATPases are found in both prokaryotes and eukaryotes.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    MAL13P1.246    MAL13P1.301    PF07_0115    PF11_0395    PF14_0736    PFA0310c    PFC0840w    PFD0555c    PFE0195w    PFE0805w    PFI0240c    PFL0590c    PFL0950c    PFL1125w   


    PTHR11939:SF10 - PTHR11939:SF10 (Panther link)

    Proteins where this domain is known:
    PFC0840w   


    PTHR11939:SF15 - gb def: ATPase 2 (Panther link)

    Proteins where this domain is known:
    PFL0950c   


    PTHR11939:SF25 - PTHR11939:SF25 (Panther link)

    Proteins where this domain is known:
    PFI0240c   


    PTHR11939:SF46 - PTHR11939:SF46 (Panther link)

    Proteins where this domain is known:
    PFE0195w   


    PTHR11939:SF48 - PTHR11939:SF48 (Panther link)

    Proteins where this domain is known:
    PF07_0115   


    PTHR11939:SF56 - PTHR11939:SF56 (Panther link)

    Proteins where this domain is known:
    PFE0805w   


    PTHR11939:SF6 - PTHR11939:SF6 (Panther link)

    Proteins where this domain is known:
    MAL13P1.301   


    PTHR11939:SF7 - PTHR11939:SF7 (Panther link)

    Proteins where this domain is known:
    PFL1125w   


    PTHR11939:SF76 - PTHR11939:SF76 (Panther link)

    Proteins where this domain is known:
    MAL13P1.246   


    PTHR11939:SF8 - PTHR11939:SF8 (Panther link)

    Proteins where this domain is known:
    PF11_0395   


    PTHR11939:SF82 - PTHR11939:SF82 (Panther link)

    Proteins where this domain is known:
    PF14_0736    PFD0555c    PFL0590c   


    PTHR11939:SF89 - gb def: Calcium-translocating P-type ATPase, SERCA-type (Panther link)

    Proteins where this domain is known:
    PFA0310c   


    PTHR11941 - PTHR11941 (Panther link)

    Proteins where this domain is known:
    PF10_0167    PF14_0232    PFL1940w   


    PTHR11946 - PTHR11946 (Panther link)

    Proteins where this domain is known:
    MAL8P1.125    PF08_0011    PF10_0053    PF10_0340    PF13_0179    PF14_0401    PF14_0589    PFC0470w    PFF1095w   


    PTHR11946:SF1 - PTHR11946:SF1 (Panther link)

    Proteins where this domain is known:
    PF10_0053    PF10_0340   


    PTHR11946:SF3 - PTHR11946:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0401   


    PTHR11946:SF5 - tRNA-synt_val (Panther link)

    Interpro entry IPR002303 : Valyl-tRNA synthetase, class Ia (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Valyl-tRNA synthetase is an alpha monomer that belongs to class Ia.
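
    The class assignments listed in the description above can be captured as a small lookup table. The following Python sketch is illustrative only: the names CLASS_I, CLASS_II and synthetase_class are hypothetical and not part of Interpro or Panther, and the sets are transcribed directly from the two amino acid lists in the description.

```python
# Amino acids whose synthetases are class I, as listed in the description.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}

# Amino acids whose synthetases are class II, as listed in the description.
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}


def synthetase_class(amino_acid: str) -> str:
    """Return "I" or "II" for the synthetase specific to the given amino acid.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"not one of the 20 listed amino acids: {amino_acid!r}")


if __name__ == "__main__":
    # Valyl-tRNA synthetase is described above as a class I enzyme.
    print(synthetase_class("valine"))
```

    Two disjoint sets keep the table easy to check against the description: together they should cover exactly 20 amino acids.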

    Proteins where this domain is known:
    PF14_0589   


    PTHR11946:SF6 - PTHR11946:SF6 (Panther link)

    Proteins where this domain is known:
    PFF1095w   


    PTHR11946:SF7 - Leu_tRNAsyn_1a (Panther link)

    Interpro entry IPR002302 : Leucyl-tRNA synthetase, class Ia, bacterial/mitochondrial (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Leucyl tRNA synthetase is an alpha monomer that belongs to class Ia. There are two different families of leucyl-tRNA synthetases. This family includes the eubacterial and mitochondrial synthetases. The crystal structure of leucyl-tRNA synthetase from the hyperthermophile Thermus thermophilus has an overall architecture that is similar to that of isoleucyl-tRNA synthetase, except that the putative editing domain is inserted at a different position in the primary structure. This feature is unique to prokaryote-like leucyl-tRNA synthetases, as is the presence of a novel additional flexibly inserted domain.

    Proteins where this domain is known:
    PF08_0011   


    PTHR11946:SF8 - Tyr-tRNA_synth (Panther link)

    Interpro entry IPR015624 : Tyrosyl-tRNA synthetase, class Ib, archaeal/eukaryotic cytosolic (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Tyrosyl-tRNA synthetase is an alpha2 dimer that belongs to class Ib. Studies on tyrosyl-tRNA synthetase provide the first kinetic evidence that the 'KMSKS' motif plays a role in the initial binding of tRNA (Tyr) to tyrosyl-tRNA synthetase.

    Proteins where this domain is known:
    MAL8P1.125   


    PTHR11946:SF9 - Ile-tRNA-synt_Ia (Panther link)

    Interpro entry IPR018353 : (Interpro link)

    Interpro description:

    Isoleucyl-tRNA synthetase is an alpha monomer that belongs to class Ia. The enzyme, isoleucyl-transfer RNA synthetase, activates not only the cognate substrate L-isoleucine but also the minimally distinct L-valine in the first, aminoacylation step. Then, in a second, "editing" step, the synthetase itself rapidly hydrolyses only the valylated products as shown from the crystal structures.

    Proteins where this domain is known:
    PF13_0179   


    PTHR11952 - UDPGP_trans (Panther link)

    Interpro entry IPR002618 : UTP--glucose-1-phosphate uridylyltransferase (Interpro link)

    Interpro description:
    This family consists of UTP--glucose-1-phosphate uridylyltransferases, also known as UDP-glucose pyrophosphorylase (UDPGP) and glucose-1-phosphate uridylyltransferase. UTP--glucose-1-phosphate uridylyltransferase catalyses the interconversion of MgUTP + glucose-1-phosphate and UDP-glucose + MgPPi. UDP-glucose is an important intermediate in mammalian carbohydrate interconversion involved in various metabolic roles depending on tissue type. In Dictyostelium discoideum (Slime mold), mutants in this enzyme abort the development cycle. Also within this family is UDP-N-acetylglucosamine pyrophosphorylase and two hypothetical proteins from Borrelia burgdorferi, the Lyme disease spirochaete.

    Proteins where this domain is known:
    MAL13P1.218    PFE0875c   


    PTHR11952:SF2 - PTHR11952:SF2 (Panther link)

    Proteins where this domain is known:
    MAL13P1.218    PFE0875c   


    PTHR11953 - PTHR11953 (Panther link)

    Proteins where this domain is known:
    PF14_0256    PFB0415c   


    PTHR11954 - MIF (Panther link)

    Interpro entry IPR001398 : (Interpro link)

    Interpro description:

    Macrophage migration inhibitory factor (MIF) is a key regulatory cytokine within innate and adaptive immune responses, capable of promoting and modulating the magnitude of the response. MIF is released from T-cells and macrophages, and acts within the neuroendocrine system. MIF is capable of tautomerase activity, although its biological function has not been fully characterised. It is induced by glucocorticoid and is capable of overriding the anti-inflammatory actions of glucocorticoid. MIF regulates cytokine secretion and the expression of receptors involved in the immune response. It can be taken up into target cells in order to interact with intracellular signalling molecules, inhibiting p53 function, and/or activating components of the mitogen-activated protein kinase and Jun-activation domain-binding protein-1 (Jab-1). MIF has been linked to various inflammatory diseases, such as rheumatoid arthritis and atherosclerosis.

    The MIF homologue D-dopachrome tautomerase is involved in detoxification through the conversion of dopaminechrome (and possibly norepinephrinechrome), the toxic quinone product of the neurotransmitter dopamine (and norepinephrine), to an indole derivative that can serve as a precursor to neuromelanin.

    Proteins where this domain is known:
    PFL1420w   


    PTHR11954:SF1 - gb def: Hypothetical protein (Panther link)

    Proteins where this domain is known:
    PFL1420w   


    PTHR11956 - Arg_tRNA-synt_1c (Panther link)

    Interpro entry IPR001278 : Arginyl-tRNA synthetase, class Ic (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Arginyl-tRNA synthetase has been crystallized and preliminary X-ray crystallographic analysis of yeast arginyl-tRNA synthetase-yeast tRNAArg complexes is available.

    Proteins where this domain is known:
    PFI0680c    PFL0900c   


    PTHR11960 - TIF_eIF_4E (Panther link)

    Interpro entry IPR001040 : Eukaryotic translation initiation factor 4E (eIF-4E) (Interpro link)

    Interpro description:
    Eukaryotic translation initiation factor 4E (eIF-4E) is a protein that binds to the cap structure of eukaryotic cellular mRNAs. eIF-4E recognises and binds the 7-methylguanosine-containing (m7Gppp) cap during an early step in the initiation of protein synthesis and facilitates ribosome binding to an mRNA by inducing the unwinding of its secondary structures. A tryptophan in the central part of the sequence of human eIF-4E seems to be implicated in cap-binding.

    Proteins where this domain is known:
    PFC0635c   


    PTHR11961 - Cyt_CIAB (Panther link)

    Interpro entry IPR002327 : Cytochrome c, class IA/ IB (Interpro link)

    Interpro description:
    Cytochromes c (cytC) can be defined as electron-transfer proteins having one or several haem c groups, bound to the protein by one or, more generally, two thioether bonds involving sulphydryl groups of cysteine residues. The fifth haem iron ligand is always provided by a histidine residue. CytC possess a wide range of properties and function in a large number of different redox processes.

    Ambler recognised four classes of cytC.

    Class I includes the low-spin soluble cytC of mitochondria and bacteria, with the haem-attachment site towards the N-terminus, and the sixth ligand provided by a methionine residue about 40 residues further on towards the C-terminus. On the basis of sequence similarity, class I cytC were further subdivided into five classes, IA to IE. Class IB includes the eukaryotic mitochondrial cyt C and prokaryotic 'short' cyt C2 exemplified by Rhodopila globiformis cyt C2; Class IA includes 'long' cyt C2, such as Rhodospirillum rubrum cyt C2 and Aquaspirillum itersonii cyt C-550, which have several extra loops by comparison with Class IB cyt C.

    The 3D structures of a considerable number of class IA and IB cytC have been determined. The proteins consist of 3-6 alpha-helices; the three most conserved 'core' helices form a 'basket' around the haem group, with one haem edge exposed to the solvent. Most class I cytC have conserved aromatic residues clustered around the haem and axial ligands.

    Proteins where this domain is known:
    MAL13P1.55    PF14_0038   


    PTHR11962 - tRNA_ribo_trans (Panther link)

    Interpro entry IPR002616 : Queuine/other tRNA-ribosyltransferase (Interpro link)

    Interpro description:
    This is a family of queuine, archaeosine and general tRNA-ribosyltransferases, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine at position 34 and with archaeosine at position 15 in archaeal tRNAs. In bacteria it catalyses the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA, giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues.
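
    The zinc binding motif C-x-C-x2-C-x29-H quoted above can be scanned for with a regular expression. This is a minimal sketch assuming the usual reading of the notation (x is any residue, xN is N arbitrary residues); the helper names motif_to_regex and find_zinc_motifs are hypothetical, and a match in a sequence is not by itself evidence of zinc binding.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate a dash-separated motif such as C-x-C-x2-C-x29-H into a regex."""
    parts = []
    for token in motif.split("-"):
        if token == "x":
            parts.append(".")  # exactly one arbitrary residue
        elif token.startswith("x") and token[1:].isdigit():
            parts.append(".{%d}" % int(token[1:]))  # N arbitrary residues
        elif len(token) == 1 and token.isalpha():
            parts.append(token.upper())  # a fixed residue
        else:
            raise ValueError(f"unsupported motif token: {token!r}")
    return "".join(parts)


ZINC_MOTIF = re.compile(motif_to_regex("C-x-C-x2-C-x29-H"))


def find_zinc_motifs(sequence: str) -> list[int]:
    """Return 0-based start positions of non-overlapping motif matches."""
    return [m.start() for m in ZINC_MOTIF.finditer(sequence.upper())]
```

    The motif spans 36 residues, so sequences shorter than that cannot match.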

    Proteins where this domain is known:
    PF07_0071    PF14_0322    PFL2030w   


    PTHR11963 - PTHR11963 (Panther link)

    Proteins where this domain is known:
    PF14_0439   


    PTHR11963:SF3 - Peptidase_M17 (Panther link)

    Interpro entry IPR011356 : Peptidase M17, leucyl aminopeptidase (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
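
    The difference between the loose HEXXH motif and the more stringent 'abXHEbbHbc' form can be illustrated with two patterns. This sketch rests on assumptions that the description does not spell out: 'uncharged' is taken to mean any residue other than D, E, K or R, 'hydrophobic' is taken to mean one of A, V, L, I, M, F, W or Y, and 'a' is left unrestricted. The function names and the peptides used to try it out are made up for illustration.

```python
import re

# Loose motif: H-E-any-any-H.
HEXXH = re.compile(r"HE..H")

# Stringent motif abXHEbbHbc under the assumptions stated above:
#   a = any residue, b = not D/E/K/R, X = any residue, c = A/V/L/I/M/F/W/Y.
UNCHARGED = "[^DEKR]"
HYDROPHOBIC = "[AVLIMFWY]"
STRINGENT = re.compile(
    "." + UNCHARGED + "." + "HE" + UNCHARGED * 2 + "H" + UNCHARGED + HYDROPHOBIC
)


def find_hexxh(sequence: str) -> list[int]:
    """0-based start positions of loose HEXXH matches."""
    return [m.start() for m in HEXXH.finditer(sequence.upper())]


def find_stringent(sequence: str) -> list[int]:
    """0-based start positions of stringent abXHEbbHbc matches."""
    return [m.start() for m in STRINGENT.finditer(sequence.upper())]
```

    A peptide with a charged residue in one of the 'b' positions still satisfies HEXXH but is rejected by the stringent pattern, which is the sense in which the longer definition is more selective.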

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    The majority of members of this family are zinc-dependent exopeptidases belonging to MEROPS peptidase family M17 (leucyl aminopeptidase, clan MF). This family excludes pepB aminopeptidases, which are also members of MEROPS family M17.

    Leucyl aminopeptidases (LAPs) selectively release N-terminal amino acid residues from polypeptides and proteins; in general they are involved in the processing, catabolism and degradation of intracellular proteins. Leucyl aminopeptidase forms a homohexamer containing two trimers stacked on top of one another. Each monomer binds two zinc ions. The zinc-binding and catalytic sites are located within the C-terminal catalytic domain. Leucine aminopeptidase has been shown to be identical with prolyl aminopeptidase in mammals.

    Interestingly, members of this group are also implicated in transcriptional regulation and are thought to combine catalytic and regulatory properties. The N-terminal domain of these proteins has been shown in Escherichia coli PepA to function as a DNA-binding protein in Xer site-specific recombination and in transcriptional control of the carAB operon. It is not well conserved and in some members can be found only by PSI-BLAST (after 4-6 iterations). It is not clear if the DNA binding function is preserved in all or even in most of the members.

    Proteins where this domain is known:
    PF14_0439   


    PTHR11964 - S-AdoMet_synt (Panther link)

    Interpro entry IPR002133 : S-adenosylmethionine synthetase (Interpro link)

    Interpro description:

    S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.

    In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.

    The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.

    Proteins where this domain is known:
    PFI1090w   


    PTHR11968 - PTHR11968 (Panther link)

    Proteins where this domain is known:
    MAL13P1.485    PF07_0129    PF10_0090    PF14_0751    PF14_0761    PFB0685c    PFB0695c    PFC0050c    PFD0085c    PFE1250w    PFF0945c    PFF1350c    PFL0035c    PFL1880w    PFL2570w   


    PTHR11968:SF42 - PTHR11968:SF42 (Panther link)

    Proteins where this domain is known:
    PFF1350c   


    PTHR11968:SF49 - PTHR11968:SF49 (Panther link)

    Proteins where this domain is known:
    PFF0945c   


    PTHR11968:SF8 - PTHR11968:SF8 (Panther link)

    Proteins where this domain is known:
    MAL13P1.485    PF07_0129    PF14_0751    PF14_0761    PFB0685c    PFB0695c    PFC0050c    PFD0085c    PFE1250w    PFL0035c    PFL1880w    PFL2570w   


    PTHR11985 - PTHR11985 (Panther link)

    Proteins where this domain is known:
    PFC0275w   


    PTHR11986 - Aminotrans_3 (Panther link)

    Interpro entry IPR005814 : Aminotransferase class-III (Interpro link)

    Interpro description:

    Aminotransferases share certain mechanistic features with other pyridoxalphosphate-dependent enzymes, such as the covalent binding of the pyridoxalphosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies. One of these, called class-III, includes acetylornithine aminotransferase, which catalyzes the transfer of an amino group from acetylornithine to alpha-ketoglutarate, yielding N-acetyl-glutamic-5-semi-aldehyde and glutamic acid; ornithine aminotransferase, which catalyzes the transfer of an amino group from ornithine to alpha-ketoglutarate, yielding glutamic-5-semi-aldehyde and glutamic acid; omega-amino acid--pyruvate aminotransferase, which catalyzes transamination between a variety of omega-amino acids, mono- and diamines, and pyruvate; 4-aminobutyrate aminotransferase (GABA transaminase), which catalyzes the transfer of an amino group from GABA to alpha-ketoglutarate, yielding succinate semialdehyde and glutamic acid; DAPA aminotransferase, a bacterial enzyme (bioA), which catalyzes an intermediate step in the biosynthesis of biotin, the transamination of 7-keto-8-aminopelargonic acid to form 7,8-diaminopelargonic acid; 2,2-dialkylglycine decarboxylase, a Burkholderia cepacia (Pseudomonas cepacia) enzyme (dgdA) that catalyzes the decarboxylating amino transfer of 2,2-dialkylglycine and pyruvate to dialkyl ketone, alanine and carbon dioxide; glutamate-1-semialdehyde aminotransferase (GSA); Bacillus subtilis aminotransferases yhxA and yodT; Haemophilus influenzae aminotransferase HI0949; and Caenorhabditis elegans aminotransferase T01B11.2.

    Proteins where this domain is known:
    PFF0435w   


    PTHR11986:SF18 - Orn_aminotrans (Panther link)

    Interpro entry IPR010164 : Ornithine aminotransferase (Interpro link)

    Interpro description:

    Ornithine aminotransferase catalyses the conversion of L-ornithine and a 2-oxo acid to L-glutamate 5-semialdehyde and an L-amino acid. This enzyme is found in low-GC bacteria, where it is responsible for the fourth step in arginine biosynthesis, and in the mitochondrial matrix of eukaryotes, where it controls L-ornithine levels in tissues. In human hereditary ornithine aminotransferase deficiency, the elevated intraocular concentrations of ornithine are responsible for gyrate atrophy, which affects the CNS and peripheral nervous system.

    Proteins where this domain is known:
    PFF0435w   


    PTHR11991 - TCTP (Panther link)

    Interpro entry IPR001983 : Translationally controlled tumour-associated TCTP (Interpro link)

    Interpro description:

    Mammalian translationally controlled tumour protein (TCTP) (or P23) is a protein which has been found to be preferentially synthesised in cells during the early growth phase of some types of tumour, but which is also expressed in normal cells. The physiological function of TCTP is still not known. It was first identified as a histamine-releasing factor, acting in IgE-dependent allergic reactions. In addition, TCTP has been shown to bind to tubulin in the cytoskeleton, has a high affinity for calcium, is the binding target for the antimalarial compound artemisinin, and is induced in vitamin D-dependent apoptosis. TCTP production is thought to be controlled at the translational as well as the transcriptional level.

    TCTP is a hydrophilic protein of 18 to 20 kDa. TCTPs do not share significant sequence similarity with any other class of proteins. Recently, the structure of TCTP was determined and exhibited significant structural similarity to the human protein Mss4, which is a guanine nucleotide-free chaperone of the Rab protein. Close homologues have been found in plants, earthworm, Caenorhabditis elegans (F52H2.11), Hydra, Saccharomyces cerevisiae (YKL056c) and Schizosaccharomyces pombe (SpAC1F12.02c).

    Proteins where this domain is known:
    PFE0545c   


    PTHR11994 - Ribosomal_L5 (Panther link)

    Interpro entry IPR002132 : Ribosomal protein L5 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    L5 is a protein of about 180 amino-acid residues.

    Proteins where this domain is known:
    PF07_0079   


    PTHR11994:SF2 - PTHR11994:SF2 (Panther link)

    Proteins where this domain is known:
    PF07_0079   


    PTHR11998 - Clathrn_med (Panther link)

    Proteins where this domain is known:
    PF11_0202    PF13_0062    PF14_0386    PFL0885w   


    PTHR11998:SF11 - PTHR11998:SF11 (Panther link)

    Proteins where this domain is known:
    PF13_0062   


    PTHR11998:SF13 - PTHR11998:SF13 (Panther link)

    Proteins where this domain is known:
    PF11_0202   


    PTHR11998:SF4 - PTHR11998:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0386   


    PTHR11998:SF9 - PTHR11998:SF9 (Panther link)

    Proteins where this domain is known:
    PFL0885w   


    PTHR12000 - Peptidase_C13 (Panther link)

    Interpro entry IPR001096 : Peptidase C13, legumain (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C13 (legumain family, clan CD). A type example is legumain from Canavalia ensiformis (Jack bean, Horse bean). The blood fluke parasite Schistosoma mansoni has two cysteine proteases in its digestive tract, one a cathepsin B-like protease, the other termed hemoglobinase. The latter has been hard to purify free of cathepsin B, and forms expressed in Escherichia coli prove to be inactive, suggesting that hemoglobinase may act in association with cathepsin B. Plant vacuolar processing enzyme and legumain from legumes have been shown to have sequence and functional similarity to hemoglobinase. The catalytic residues of the family are currently unknown, but sequence alignments reveal one totally conserved cysteine and two totally conserved histidines.

    Proteins where this domain is known:
    PF11_0298   


    PTHR12000:SF1 - GPI-ANCHOR TRANSAMIDASE (Panther link)

    Proteins where this domain is known:
    PF11_0298   


    PTHR12001 - Polyprenyl_synt (Panther link)

    Interpro entry IPR017446 : (Interpro link)

    Interpro description:
    A variety of isoprenoid compounds are synthesized by various organisms. For example, in eukaryotes the isoprenoid biosynthetic pathway is responsible for the synthesis of a variety of end products including cholesterol, dolichol and ubiquinone (coenzyme Q). In bacteria this pathway leads to the synthesis of isopentenyl tRNA, isoprenoid quinones, and sugar carrier lipids. Among the enzymes that participate in that pathway are a number of polyprenyl synthetase enzymes which catalyze a 1'-4 condensation between 5-carbon isoprene units. It has been shown that all the above enzymes share some regions of sequence similarity. Two of these regions are rich in aspartic-acid residues and could be involved in the catalytic mechanism and/or the binding of the substrates.

    Proteins where this domain is known:
    PFB0130w   


    PTHR12001:SF17 - PTHR12001:SF17 (Panther link)

    Proteins where this domain is known:
    PFB0130w   


    PTHR12010 - FAMILY NOT NAMED (Panther link)

    Proteins where this domain is known:
    MAL7P1.300   


    PTHR12010:SF1 - SUBFAMILY NOT NAMED (Panther link)

    Proteins where this domain is known:
    MAL7P1.300   


    PTHR12011 - G-PROTEIN COUPLED RECEPTOR (Panther link)

    Proteins where this domain is known:
    PF13_0201   


    PTHR12011:SF6 - gb def: MKIAA0550 protein (Fragment) (Panther link)

    Proteins where this domain is known:
    PF13_0201   


    PTHR12013 - SRP14 (Panther link)

    Interpro entry IPR003210 : Signal recognition particle, SRP14 subunit (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents the 14 kDa SRP14 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.

    Proteins where this domain is known:
    PFL0160w   


    PTHR12022 - UCR_14kDa (Panther link)

    Interpro entry IPR003197 : Cytochrome bd ubiquinol oxidase, 14 kDa subunit (Interpro link)

    Interpro description:

    The cytochrome bd type terminal oxidases catalyse quinol-dependent, Na+-independent oxygen uptake. Members of this family are integral membrane proteins and contain a protohaem IX centre B558.

    Cytochrome bd may play an important role in microaerobic nitrogen fixation in the enteric bacterium Klebsiella pneumoniae, where it is expressed under all conditions that permit diazotrophy.

    The 14 kDa (or VI) subunit of the complex is not directly involved in electron transfer, but has a role in assembly of the complex.

    Proteins where this domain is known:
    PF10_0120   


    PTHR12029 - PTHR12029 (Panther link)

    Proteins where this domain is known:
    PF10_0300    PF14_0273    PFE1275c   


    PTHR12029:SF11 - PTHR12029:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0273   


    PTHR12029:SF7 - PTHR12029:SF7 (Panther link)

    Proteins where this domain is known:
    PF10_0300   


    PTHR12029:SF8 - PTHR12029:SF8 (Panther link)

    Proteins where this domain is known:
    PFE1275c   


    PTHR12039 - NAMN_adtrnsfrase (Panther link)

    Interpro entry IPR005248 : Probable nicotinate-nucleotide adenylyltransferase (Interpro link)

    Interpro description:

    This family contains the predominant bacterial/eukaryotic adenylyltransferases for nicotinamide-nucleotide and for the deamido form, nicotinate nucleotide. Nicotinamide-nucleotide adenylyltransferase synthesizes NAD by the salvage pathway, while nicotinate-nucleotide adenylyltransferase synthesizes the immediate precursor of NAD by the de novo pathway.

    Proteins where this domain is known:
    PF13_0159   


    PTHR12040 - Anti-silence (Panther link)

    Interpro entry IPR006818 : Histone chaperone, ASF1-like (Interpro link)

    Interpro description:

    This family includes the yeast and human ASF1 proteins. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.

    Proteins where this domain is known:
    PFL1180w   


    PTHR12045 - Allantoicase (Panther link)

    Interpro entry IPR005164 : Allantoicase (Interpro link)

    Interpro description:

    Allantoicase (also known as allantoate amidinohydrolase) is involved in purine degradation, facilitating the utilization of purines as secondary nitrogen sources under nitrogen-limiting conditions. While purine degradation converges to uric acid in all vertebrates, its further degradation varies from species to species. Uric acid is excreted by birds, reptiles, and some mammals that do not have a functional uricase gene, whereas other mammals produce allantoin. Amphibians and microorganisms produce ammonia and carbon dioxide using the uricolytic pathway. Allantoicase performs the second step in this pathway catalyzing the conversion of allantoate into ureidoglycolate and urea.

     allantoate + H(2)O =  (S)-ureidoglycolate + urea

    The structure of allantoicase is best described as being composed of two repeats (the allantoicase repeats: AR1 and AR2), which are connected by a flexible linker. The crystal structure, solved at 2.4 Å resolution, reveals that AR1 has a very similar fold to AR2, both repeats being jelly-roll motifs, composed of four-stranded and five-stranded antiparallel beta-sheets. Each jelly-roll motif has two conserved surface patches that probably constitute the active site.

    Proteins where this domain is known:
    PF07_0120    PF14_0384   


    PTHR12046 - PTHR12046 (Panther link)

    Proteins where this domain is known:
    PFD0795w   


    PTHR12052 - mRNA_splic_U5 (Panther link)

    Interpro entry IPR004123 : mRNA splicing factor, thioredoxin-like U5 snRNP (Interpro link)

    Interpro description:

    Thioredoxins are small disulphide-containing redox proteins that have been found in all the kingdoms of living organisms. Thioredoxin serves as a general protein disulphide oxidoreductase. It interacts with a broad range of proteins by a redox mechanism based on reversible oxidation of 2 cysteine thiol groups to a disulphide, accompanied by the transfer of 2 electrons and 2 protons. The net result is the covalent interconversion of a disulphide and a dithiol.

    Compared to human thioredoxin, human U5 snRNP-specific protein U5-15kD contains 37 additional residues that may cause structural changes which most likely form putative binding sites for other spliceosomal proteins or RNA. Although U5-15kD apparently lacks protein disulphide isomerase activity, it is strictly required for pre-mRNA splicing.

    Proteins where this domain is known:
    PFL1520w   


    PTHR12056 - PTHR12056 (Panther link)

    Proteins where this domain is known:
    MAL13P1.213   


    PTHR12064 - PTHR12064 (Panther link)

    Proteins where this domain is known:
    PFI1560c   


    PTHR12072 - PTHR12072 (Panther link)

    Proteins where this domain is known:
    PFL0980w   


    PTHR12083 - PNK_3Pase (Panther link)

    Interpro entry IPR015636 : (Interpro link)

    Interpro description:

    Many eukaryotes possess polynucleotide kinase phosphatase (PNKP), a bifunctional enzyme with 5'-kinase and 3'-phosphatase activities provided by two non-overlapping catalytic domains. These proteins catalyse the dephosphorylation of DNA 3'-phosphates. It is believed that this activity is important for the repair of single-strand breaks in DNA caused by radiation or oxidative damage. Mammalian polynucleotide kinase phosphatase (PNKP) is a key component of both the base excision repair (BER) and nonhomologous end-joining (NHEJ) DNA repair pathways. PNKP creates 5'-phosphate/3'-hydroxyl termini, which are a necessary prerequisite for ligation during repair. PNKP is recruited to repair complexes through interactions between its N-terminal FHA domain and phosphorylated components of either pathway.

    Synonym(s): PNKP,PNK

    Proteins where this domain is known:
    PF13_0334   


    PTHR12087 - PTHR12087 (Panther link)

    Proteins where this domain is known:
    PF13_0189   


    PTHR12096 - SKIP_SNW (Panther link)

    Interpro entry IPR017862 : SKI-interacting protein, SKIP (Interpro link)

    Interpro description:

    SKIP (SKI-interacting protein) is an essential spliceosomal component and transcriptional coregulator, which may provide regulatory coupling of transcription initiation and splicing. SKIP was identified in a yeast 2-hybrid screen, where it was shown to interact with both the cellular and viral forms of SKI through the highly conserved region on SKIP known as the SNW domain. SKIP is now known to interact with a number of other proteins as well. SKIP potentiates the activity of important transcription factors, such as vitamin D receptor, CBF1 (RBP-Jkappa), Smad2/3, and MyoD. It works with Ski in overcoming pRb-mediated cell cycle arrest, and it is targeted by the viral transactivators EBNA2 and E7.

    Proteins where this domain is known:
    PFB0875c   


    PTHR12097 - PTHR12097 (Panther link)

    Proteins where this domain is known:
    PFC0375c   


    PTHR12106 - PTHR12106 (Panther link)

    Proteins where this domain is known:
    PF14_0493   


    PTHR12106:SF7 - PTHR12106:SF7 (Panther link)

    Proteins where this domain is known:
    PF14_0493   


    PTHR12111 - DUF572 (Panther link)

    Interpro entry IPR007590 : (Interpro link)

    Interpro description:
    This is a family of eukaryotic proteins with undetermined function.

    Proteins where this domain is known:
    PF10_0148    PFL1560c   


    PTHR12112 - PTHR12112 (Panther link)

    Proteins where this domain is known:
    PF10_0071   


    PTHR12121 - PTHR12121 (Panther link)

    Proteins where this domain is known:
    PF13_0336    PFA0350w    PFC0850c    PFE0980c    PFL1210w   


    PTHR12121:SF1 - PTHR12121:SF1 (Panther link)

    Proteins where this domain is known:
    PFL1210w   


    PTHR12121:SF4 - PTHR12121:SF4 (Panther link)

    Proteins where this domain is known:
    PFA0350w   


    PTHR12121:SF6 - PTHR12121:SF6 (Panther link)

    Proteins where this domain is known:
    PFC0850c   


    PTHR12121:SF8 - PTHR12121:SF8 (Panther link)

    Proteins where this domain is known:
    PF13_0336   


    PTHR12124 - PTHR12124 (Panther link)

    Proteins where this domain is known:
    MAL13P1.311    MAL8P1.35    PF14_0473    PFB0215c   


    PTHR12131 - PTHR12131 (Panther link)

    Proteins where this domain is known:
    PFF1140c   


    PTHR12133 - PTHR12133 (Panther link)

    Proteins where this domain is known:
    PF13_0087   


    PTHR12135 - Rad4 (Panther link)

    Interpro entry IPR004583 : DNA repair protein Rad4 (Interpro link)

    Interpro description:

    Mutations in the nucleotide excision repair (NER) pathway can cause the xeroderma pigmentosum skin cancer predisposition syndrome. NER lesions are limited to one DNA strand, but otherwise they are chemically and structurally diverse, being caused by a wide variety of genotoxic chemicals and ultraviolet radiation. The xeroderma pigmentosum C (XPC) protein has a central role in initiating global-genome NER by recognizing the lesion and recruiting downstream factors.

    In NER in eukaryotes, DNA is incised on both sides of the lesion, resulting in the removal of a fragment ~25-30 nucleotides long. This is followed by repair synthesis and ligation. This reaction, in yeast, requires the damage binding factors Rad14, RPA, and the Rad4-Rad23 complex, the transcription factor TFIIH which contains the two DNA helicases Rad3 and Rad25, essential for creating a bubble structure, and the two endonucleases, the Rad1-Rad10 complex and Rad2, which incise the damaged DNA strand on the 5'- and 3'-side of the lesion, respectively.

    The crystal structure of the yeast XPC orthologue Rad4 bound to DNA containing a cyclobutane pyrimidine dimer lesion has been determined. The structure shows that Rad4 inserts a beta-hairpin through the DNA duplex, causing the two damaged base pairs to flip out of the double helix. The expelled nucleotides of the undamaged strand are recognized by Rad4, whereas the two cyclobutane pyrimidine dimer-linked nucleotides become disordered. This indicates that the lesions recognised by Rad4/XPC thermodynamically destabilize the double helix in a manner that facilitates the flipping-out of two base pairs.

    Homologues of all the above mentioned yeast genes, except for RAD7, RAD16, and MMS19, have been identified in humans, and mutations in these human genes affect NER in a similar fashion as they do in yeast, with the exception of XPC, the human counterpart of yeast RAD4. Deletion of RAD4 causes the same high level of UV sensitivity as do mutations in the other class 1 genes, and rad4 mutants are completely defective in incision. By contrast, XPC is required for the repair of nontranscribed regions of the genome but not for the repair of the transcribed DNA strand.

    Proteins where this domain is known:
    PF14_0308   


    PTHR12146 - PTHR12146 (Panther link)

    Proteins where this domain is known:
    PF07_0080   


    PTHR12150 - DUF171 (Panther link)

    Interpro entry IPR003750 : (Interpro link)

    Interpro description:

    This entry describes proteins of unknown function.

    Proteins where this domain is known:
    PF14_0307   


    PTHR12151 - SCO1_SenC (Panther link)

    Interpro entry IPR003782 : (Interpro link)

    Interpro description:

    This family is involved in biogenesis of respiratory and photosynthetic systems. In yeast the SCO1 protein is specifically required for a post-translational step in the accumulation of subunits 1 and 2 of cytochrome c oxidase (COX-I and COX-II). It is a mitochondrion-associated cytochrome c oxidase assembly factor.

    The purple nonsulphur photosynthetic eubacterium Rhodobacter capsulatus is a versatile organism that can obtain cellular energy by several means, including the capture of light energy for photosynthesis as well as the use of light-independent respiration, in which molecular oxygen serves as a terminal electron acceptor. The SenC protein is required for optimal cytochrome c oxidase activity in aerobically grown R. capsulatus cells and is involved in the induction of structural polypeptides of the light-harvesting and reaction centre complexes.

    Proteins where this domain is known:
    PF07_0034   


    PTHR12154 - Oligosacch_biosynth_Alg14 (Panther link)

    Interpro entry IPR013969 : (Interpro link)

    Interpro description:

    Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane.

    Proteins where this domain is known:
    PFB0515w   


    PTHR12169 - AFG1_ATPase (Panther link)

    Interpro entry IPR005654 : ATPase, AFG1-like (Interpro link)

    Interpro description:

    ATPase family gene 1 (AFG1) ATPase is a 377 amino acid putative protein with an ATPase motif typical of the protein family including SEC18p, PAS1, CDC48-VCP and TBP. AFG1 also has substantial homology to these proteins outside the ATPase domain. This family of proteins contains a P-loop motif.

    Proteins where this domain is known:
    PFE1090w   


    PTHR12170 - PTHR12170 (Panther link)

    Proteins where this domain is known:
    PF13_0164   


    PTHR12170:SF2 - PTHR12170:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0164   


    PTHR12174 - Peptidase_A22B (Panther link)

    Interpro entry IPR007369 : Peptidase A22B, signal peptide peptidase (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is very conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    This group of sequences contains aspartic endopeptidases belonging to MEROPS peptidase family A22 (presenilin family, clan AD): subfamily A22B.

    The peptidases were originally classified by hierarchical homology to the most conserved member - IMPAS 1. They are also known as signal peptide peptidase (SPP). They belong to the I-CliP family of peptidases. SPP cleaves remnant signal peptides left behind in the membrane by the action of signal peptidase and also plays key roles in immune surveillance and the maturation of certain viral proteins. SPPs do not require cofactors as demonstrated by expression in bacteria and purification of a proteolytically active form. The C-terminal region defines the functional domain, which is in itself sufficient for proteolytic activity.

    Proteins where this domain is known:
    PF14_0543   


    PTHR12174:SF23 - PTHR12174:SF23 (Panther link)

    Proteins where this domain is known:
    PF14_0543   


    PTHR12175 - PTHR12175 (Panther link)

    Proteins where this domain is known:
    MAL13P1.159   


    PTHR12176 - PTHR12176 (Panther link)

    Proteins where this domain is known:
    PF14_0526    PFD0460c   


    PTHR12181 - PTHR12181 (Panther link)

    Proteins where this domain is known:
    PF10_0265    PFC0150w   


    PTHR12181:SF3 - PTHR12181:SF3 (Panther link)

    Proteins where this domain is known:
    PFC0150w   


    PTHR12181:SF6 - PTHR12181:SF6 (Panther link)

    Proteins where this domain is known:
    PF10_0265   


    PTHR12189 - PTHR12189 (Panther link)

    Proteins where this domain is known:
    PF07_0020   


    PTHR12196 - PTHR12196 (Panther link)

    Proteins where this domain is known:
    PFL1080c   


    PTHR12197 - PTHR12197 (Panther link)

    Proteins where this domain is known:
    PF11_0160    PF13_0293    PFF0105w    PFI0485c   


    PTHR12197:SF12 - PTHR12197:SF12 (Panther link)

    Proteins where this domain is known:
    PF13_0293   


    PTHR12197:SF15 - PTHR12197:SF15 (Panther link)

    Proteins where this domain is known:
    PFF0105w   


    PTHR12197:SF6 - PTHR12197:SF6 (Panther link)

    Proteins where this domain is known:
    PFI0485c   


    PTHR12197:SF8 - PTHR12197:SF8 (Panther link)

    Proteins where this domain is known:
    PF11_0160   


    PTHR12202 - PTHR12202 (Panther link)

    Proteins where this domain is known:
    PFL0355c   


    PTHR12209 - PTHR12209 (Panther link)

    Proteins where this domain is known:
    MAL7P1.26   


    PTHR12210 - PTHR12210 (Panther link)

    Proteins where this domain is known:
    PF07_0110    PFE0795c   


    PTHR12210:SF2 - PTHR12210:SF2 (Panther link)

    Proteins where this domain is known:
    PF07_0110   


    PTHR12210:SF4 - PTHR12210:SF4 (Panther link)

    Proteins where this domain is known:
    PFE0795c   


    PTHR12217 - PTHR12217 (Panther link)

    Proteins where this domain is known:
    PFI0365w   


    PTHR12217:SF1 - PTHR12217:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0365w   


    PTHR12220 - Ribosomal_L16 (Panther link)

    Interpro entry IPR000114 : Ribosomal protein L16 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L16 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L16 is known to bind directly to the 23S rRNA and to be located at the A site of the peptidyltransferase centre. L16 is a protein of 133 to 185 amino-acid residues.

    Proteins where this domain is known:
    PF14_0041   


    PTHR12220:SF11 - PTHR12220:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0041   


    PTHR12221 - PTHR12221 (Panther link)

    Proteins where this domain is known:
    PF11_0090   


    PTHR12225 - ARM_1 (Panther link)

    Interpro entry IPR006773 : Adhesion regulating molecule (Interpro link)

    Interpro description:

    This is a family of eukaryotic proteins, many of which are believed to be involved in cell adhesion. Members are involved in gastrulation and also in metastasis formation and the progression of cancer. Experimental evidence suggests that these proteins are transmembrane and possibly glycoproteins.

    Proteins where this domain is known:
    PF14_0138   


    PTHR12226 - PTHR12226 (Panther link)

    Proteins where this domain is known:
    PF11_0361   


    PTHR12233 - PTHR12233 (Panther link)

    Proteins where this domain is known:
    PFL2415w   


    PTHR12233:SF1 - Vps26 (Panther link)

    Interpro entry IPR005377 : Vacuolar protein sorting-associated protein 26 (Interpro link)

    Interpro description:

    The movement of lipid and protein components between intracellular organelles requires the regulated interactions of many molecules. Vacuolar protein sorting-associated protein (Vps)5 is a yeast protein that is a subunit of a large multimeric complex, termed the retromer complex, involved in retrograde transport of proteins from endosomes to the trans-Golgi network. Sorting nexin (SNX) 1 and SNX2 are its mammalian orthologs.

    To carry out its biological functions, Vps5 forms the retromer complex with at least four other proteins: Vps17, Vps26, Vps29, and Vps35. This family of Vps26-proteins also contains Down syndrome critical region 3/A.

    Proteins where this domain is known:
    PFL2415w   


    PTHR12239 - PTHR12239 (Panther link)

    Proteins where this domain is known:
    PF10_0343   


    PTHR12241 - Tub_tyr_ligase (Panther link)

    Interpro entry IPR004344 : Tubulin-tyrosine ligase (Interpro link)

    Interpro description:

    Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). Tubulin-tyrosine ligase (TTL) catalyses the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. The true physiological function of TTL has so far not been established. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness.

    3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis.

    Proteins where this domain is known:
    PF10_0094    PF11_0481    PFE0700c   


    PTHR12241:SF11 - TUBULIN TYROSINE LIGASE-RELATED (Panther link)

    Proteins where this domain is known:
    PF10_0094   


    PTHR12241:SF12 - PTHR12241:SF12 (Panther link)

    Proteins where this domain is known:
    PFE0700c   


    PTHR12260 - PTHR12260 (Panther link)

    Proteins where this domain is known:
    PFB0425c   


    PTHR12262 - Rcd1 (Panther link)

    Interpro entry IPR007216 : (Interpro link)

    Interpro description:

    Rcd1 (Required for cell differentiation 1)-like proteins are found among a wide range of organisms. Rcd1 was initially identified as an essential factor in nitrogen starvation-invoked differentiation in fission yeast. This results largely from a defect in nitrogen starvation-invoked induction of ste11+, a key transcriptional factor gene required for the onset of sexual development. It is one of the most conserved proteins in eukaryotes, and its mammalian homologue is expressed in a variety of differentiating tissues. The mammalian Rcd1 is a novel transcriptional cofactor and is critical for retinoic acid-induced differentiation of F9 mouse teratocarcinoma cells, at least in part, via forming complexes with retinoic acid receptor and activation transcription factor-2 (ATF-2). Two of the members in this family have been characterised as being involved in regulation of Ste11 regulated sex genes.

    Proteins where this domain is known:
    PFE0375w   


    PTHR12271 - PTHR12271 (Panther link)

    Proteins where this domain is known:
    PF10_0152    PFL1585c   


    PTHR12271:SF14 - PTHR12271:SF14 (Panther link)

    Proteins where this domain is known:
    PF10_0152    PFL1585c   


    PTHR12276 - PTHR12276 (Panther link)

    Proteins where this domain is known:
    PFL2195w   


    PTHR12276:SF8 - PTHR12276:SF8 (Panther link)

    Proteins where this domain is known:
    PFL2195w   


    PTHR12277 - PTHR12277 (Panther link)

    Proteins where this domain is known:
    MAL7P1.156    MAL8P1.138    PF11_0211    PF14_0556    PFD0185c   


    PTHR12277:SF8 - PTHR12277:SF8 (Panther link)

    Proteins where this domain is known:
    MAL7P1.156    MAL8P1.138    PF11_0211    PFD0185c   


    PTHR12277:SF9 - PTHR12277:SF9 (Panther link)

    Proteins where this domain is known:
    PF14_0556   


    PTHR12280 - PTHR12280 (Panther link)

    Proteins where this domain is known:
    PF14_0200    PF14_0354   


    PTHR12280:SF2 - PTHR12280:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0354   


    PTHR12280:SF5 - PTHR12280:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0200   


    PTHR12286 - PTHR12286 (Panther link)

    Proteins where this domain is known:
    PFB0880w   


    PTHR12290 - PTHR12290 (Panther link)

    Proteins where this domain is known:
    PF11_0036    PFD0610w   


    PTHR12290:SF2 - PTHR12290:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0036    PFD0610w   


    PTHR12292 - PTHR12292 (Panther link)

    Proteins where this domain is known:
    MAL8P1.41   


    PTHR12298 - PTHR12298 (Panther link)

    Proteins where this domain is known:
    MAL7P1.76    PFE0515w   


    PTHR12298:SF1 - PTHR12298:SF1 (Panther link)

    Proteins where this domain is known:
    MAL7P1.76    PFE0515w   


    PTHR12300 - TB2_DP1_HVA22 (Panther link)

    Interpro entry IPR004345 : (Interpro link)

    Interpro description:

    This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein which in human is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease.

    The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein (e.g. Q07764), which is thought to be a regulatory protein.

    Proteins where this domain is known:
    MAL13P1.288    PFC0730w   


    PTHR12300:SF13 - PTHR12300:SF13 (Panther link)

    Proteins where this domain is known:
    PFC0730w   


    PTHR12302 - PTHR12302 (Panther link)

    Proteins where this domain is known:
    PF11_0374   


    PTHR12303 - N2227 (Panther link)

    Interpro entry IPR012901 : (Interpro link)

    Interpro description:

    This family features sequences that are similar to a region of hypothetical yeast gene product N2227. This is thought to be expressed during meiosis and may be involved in the defence response to stressful conditions.

    Proteins where this domain is known:
    PFC0390w   


    PTHR12309 - PTHR12309 (Panther link)

    Proteins where this domain is known:
    PFB0450w   


    PTHR12311 - PTHR12311 (Panther link)

    Proteins where this domain is known:
    PF07_0083   


    PTHR12311:SF6 - PTHR12311:SF6 (Panther link)

    Proteins where this domain is known:
    PF07_0083   


    PTHR12313 - PTHR12313 (Panther link)

    Proteins where this domain is known:
    PFF1325c   


    PTHR12315 - PTHR12315 (Panther link)

    Proteins where this domain is known:
    PF11_0468   


    PTHR12320 - PTHR12320 (Panther link)

    Proteins where this domain is known:
    PF07_0019    PF10_0093    PFL0445w   


    PTHR12341 - PTHR12341 (Panther link)

    Proteins where this domain is known:
    PF11_0074    PFI0455w   


    PTHR12356 - PTHR12356 (Panther link)

    Proteins where this domain is known:
    MAL8P1.96    PF13_0204    PFI0990c    PFI1325w   


    PTHR12356:SF1 - PTHR12356:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0990c    PFI1325w   


    PTHR12356:SF2 - PTHR12356:SF2 (Panther link)

    Proteins where this domain is known:
    MAL8P1.96   


    PTHR12356:SF3 - PTHR12356:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0204   


    PTHR12357 - YTH (Panther link)

    Interpro entry IPR007275 : (Interpro link)

    Interpro description:

    This family of poorly characterised proteins contains YT521-B, a putative splicing factor from rat. YT521-B is a tyrosine-phosphorylated nuclear protein that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68 kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner.

    Proteins where this domain is known:
    PF14_0193    PFC0410w   


    PTHR12357:SF3 - PTHR12357:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0193   


    PTHR12357:SF4 - PTHR12357:SF4 (Panther link)

    Proteins where this domain is known:
    PFC0410w   


    PTHR12373 - Enh_rudimentary (Panther link)

    Interpro entry IPR000781 : (Interpro link)

    Interpro description:

    The Drosophila protein 'enhancer of rudimentary' (gene e(r)) is a small protein of 104 residues whose function is not yet clear. From an evolutionary point of view, it is highly conserved and has been found to exist in probably all multicellular eukaryotic organisms. It has been proposed that this protein plays a role in the cell cycle.

    Proteins where this domain is known:
    PF10_0370   


    PTHR12374 - PTHR12374 (Panther link)

    Proteins where this domain is known:
    PF10_0143   


    PTHR12374:SF1 - PTHR12374:SF1 (Panther link)

    Proteins where this domain is known:
    PF10_0143   


    PTHR12375 - LUC7_rel (Panther link)

    Interpro entry IPR004882 : (Interpro link)

    Interpro description:

    This family consists of several LUC7 protein homologues that are restricted to eukaryotes. LUC7 has been shown to be a U1 snRNA associated protein with a role in splice site recognition. The entry contains human and mouse LUC7 like (LUC7L) proteins and human cisplatin resistance-associated overexpressed protein (CROP).

    Proteins where this domain is known:
    PF14_0502   


    PTHR12375:SF6 - PTHR12375:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0502   


    PTHR12377 - PTHR12377 (Panther link)

    Proteins where this domain is known:
    PF07_0011   


    PTHR12378 - PTHR12378 (Panther link)

    Proteins where this domain is known:
    PFI0940c    PFL0865w   


    PTHR12380 - SYNTAXIN (Panther link)

    Proteins where this domain is known:
    PF14_0500    PFE1505w   


    PTHR12380:SF10 - SYNTAXIN 6 (Panther link)

    Proteins where this domain is known:
    PF14_0500   


    PTHR12381 - PTHR12381 (Panther link)

    Proteins where this domain is known:
    PF14_0521    PF14_0522    PF14_0527   


    PTHR12381:SF4 - PTHR12381:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0522   


    PTHR12381:SF5 - PTHR12381:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0527   


    PTHR12381:SF6 - PTHR12381:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0521   


    PTHR12383 - Peptidase_S26A (Panther link)

    Interpro entry IPR014037 : Peptidase S26A (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
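
    The link between triad order and clan described above can be illustrated with a small lookup. This is only a sketch: the table holds just the three clans named in the text, the function names are hypothetical, and the residue positions in the example are illustrative input rather than data from this listing.

```python
# Linear order of the catalytic-triad residues (one-letter codes) for the
# three clans named in the text. This is not a complete MEROPS table.
TRIAD_ORDER_BY_CLAN = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def triad_order(positions):
    """Return the residue letters sorted by their sequence position.

    `positions` maps a one-letter residue code to its position in the
    sequence, e.g. {"H": 57, "D": 102, "S": 195}.
    """
    return "".join(sorted(positions, key=positions.get))


def clans_for_triad_order(order):
    """Return the clans (from the small table above) with this triad order."""
    order = order.upper()
    return sorted(clan for clan, o in TRIAD_ORDER_BY_CLAN.items() if o == order)


if __name__ == "__main__":
    # Illustrative positions: His before Asp before Ser gives "HDS".
    order = triad_order({"H": 57, "D": 102, "S": 195})
    print(order, clans_for_triad_order(order))
```

    An order that is not in the table returns an empty list, which fits the hedged wording above: the linear arrangement commonly, but not always, reflects clan membership.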

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to MEROPS peptidase family S26 (signal peptidase I family, clan SF), subfamily S26A.

    At least 3 eubacterial leader peptidases are known: murein prelipoprotein peptidase, which cleaves the leader peptide from a component of the bacterial outer membrane; type IV prepilin leader peptidase; and the serine-dependent leader peptidase 1, which has the more general role of cleaving the leader peptide from a variety of secreted proteins and proteins directed to the periplasm and periplasmic membrane. Leader peptidase 1 is similar to the eukaryotic signal peptidase, although the bacterial protein is monomeric, while the eukaryotic protein is multimeric.

    Mitochondria contain a similar two-subunit serine protease that removes leader peptides from nuclear- and mitochondrial-encoded proteins, which localise in the inner mitochondrial space. The catalytic residues of a number of these peptidases have been identified as a serine/lysine dyad.

    Proteins where this domain is known:
    PF13_0118   


    PTHR12383:SF2 - PTHR12383:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0118   


    PTHR12387 - Nin1_C (Panther link)

    Interpro entry IPR006746 : 26S proteasome non-ATPase regulatory subunit Rpn12 (Interpro link)

    Interpro description:

    Intracellular proteins, including short-lived proteins such as cyclin, Mos, Myc, p53, NF-kappaB, and IkappaB, are degraded by the ubiquitin-proteasome system. The 26S proteasome is a self-compartmentalising protease responsible for the regulated degradation of intracellular proteins in eukaryotes. This giant intracellular protease is formed by several subunits arranged into two 19S polar caps, where protein recognition and ATP-dependent unfolding occur, flanking a 20S central barrel-shaped structure with an inner proteolytic chamber. This overall structure is highly conserved among eukaryotes and is essential for cell viability. Proteins targeted to the 26S proteasome are conjugated with a polyubiquitin chain by an enzymatic cascade before delivery to the 26S proteasome for degradation into oligopeptides.

    The 19S component is divided into a "base" subunit containing six ATPases (Rpt proteins) and two non-ATPases (Rpn1, Rpn2), and a "lid" subunit composed of eight stoichiometric proteins (Rpn3, Rpn5, Rpn6, Rpn7, Rpn8, Rpn9, Rpn11, Rpn12). Additional non-essential and species-specific proteins may also be present. The 19S unit performs several essential functions including binding the specific protein substrates, unfolding them, cleaving the attached ubiquitin chains, opening the 20S subunit, and driving the unfolded polypeptide into the proteolytic chamber for degradation. The 26S proteasome and 19S regulator are of medical interest due to their involvement in burn rehabilitation.

    This entry represents Rpn12 (also often annotated as 26S proteasome non-ATPase regulatory subunit 8). This protein has been shown to be important for the transition from metaphase to anaphase and the activation of Cdc28p kinase in yeast.

    Proteins where this domain is known:
    PFC0520w   


    PTHR12398 - PTHR12398 (Panther link)

    Proteins where this domain is known:
    PFC0886w   


    PTHR12398:SF4 - PTHR12398:SF4 (Panther link)

    Proteins where this domain is known:
    PFC0886w   


    PTHR12399 - EIF-3_zeta (Panther link)

    Interpro entry IPR007783 : Eukaryotic translation initiation factor 3, subunit 7 (Interpro link)

    Interpro description:

    This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40S ribosomal subunits, stabilisation of ternary complex binding to 40S subunits, and dissociation of 40S and 60S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals.

    Proteins where this domain is known:
    PF10_0077   


    PTHR12400 - IPK (Panther link)

    Interpro entry IPR005522 : Inositol polyphosphate kinase (Interpro link)

    Interpro description:

    ArgRIII has been demonstrated to be an inositol polyphosphate kinase which catalyses the reaction:

    ATP + 1D-myo-inositol 1,4,5-trisphosphate = ADP + 1D-myo-inositol 1,3,4,5-tetrakisphosphate

    Proteins where this domain is known:
    PF13_0089    PFE0740c   


    PTHR12400:SF2 - PTHR12400:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0740c   


    PTHR12400:SF21 - PTHR12400:SF21 (Panther link)

    Proteins where this domain is known:
    PF13_0089   


    PTHR12403 - Sedlin (Panther link)

    Interpro entry IPR006722 : Sedlin (Interpro link)

    Interpro description:

    Sedlin is a 140 amino-acid protein with a putative role in endoplasmic reticulum-to-Golgi transport. Several missense mutations and deletion mutations in the SEDL gene, which result in protein truncation by frame shift, are responsible for spondyloepiphyseal dysplasia tarda, a progressive skeletal disorder (OMIM:313400).

    Proteins where this domain is known:
    PF13_0174   


    PTHR12403:SF1 - PTHR12403:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0174   


    PTHR12406 - PTHR12406 (Panther link)

    Proteins where this domain is known:
    PFB0870w   


    PTHR12409 - PTHR12409 (Panther link)

    Proteins where this domain is known:
    MAL7P1.94   


    PTHR12411 - Peptidase_C1A (Panther link)

    Interpro entry IPR013128 : Peptidase C1A, papain (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to MEROPS peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.

    The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.

    The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.

    Proteins where this domain is known:
    PF11_0161    PF11_0162    PF11_0165    PF11_0174    PF14_0553    PFB0325c    PFB0330c    PFB0335c    PFB0340c    PFB0345c    PFB0350c    PFB0355c    PFB0360c    PFD0230c    PFI0135c    PFL2290w   


    PTHR12411:SF17 - PTHR12411:SF17 (Panther link)

    Proteins where this domain is known:
    PF11_0174    PFL2290w   


    PTHR12411:SF25 - PTHR12411:SF25 (Panther link)

    Proteins where this domain is known:
    PF14_0553   


    PTHR12411:SF35 - gb def: Berghepain-2 (Panther link)

    Proteins where this domain is known:
    PF11_0161    PF11_0162    PF11_0165   


    PTHR12411:SF8 - PTHR12411:SF8 (Panther link)

    Proteins where this domain is known:
    PFB0325c    PFB0330c    PFB0335c    PFB0340c    PFB0345c    PFB0350c    PFB0355c    PFB0360c    PFI0135c   


    PTHR12412 - CAP BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    MAL13P1.352   


    PTHR12412:SF2 - CAP BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    MAL13P1.352   


    PTHR12416 - DUF652 (Panther link)

    Interpro entry IPR006984 : (Interpro link)

    Interpro description:
    This family comprises uncharacterised eukaryotic proteins.

    Proteins where this domain is known:
    MAL8P1.67    PFL1220w   


    PTHR12419 - PTHR12419 (Panther link)

    Proteins where this domain is known:
    PF10_0308    PFI1135c   


    PTHR12428 - Innermemb_insert (Panther link)

    Interpro entry IPR001708 : 60 kDa inner membrane insertion protein (Interpro link)

    Interpro description:

    This family of proteins is required for the insertion of integral membrane proteins into cellular membranes. Many of these integral membrane proteins are associated with respiratory chain complexes; for example, a large number of members of this family play an essential role in the activity and assembly of cytochrome c oxidase.

    Stage III sporulation protein J (SP3J) is a probable lipoprotein, rich in basic and hydrophobic amino acids. Mutations in the protein abolish the transcription of prespore-specific genes transcribed by the sigma G form of RNA polymerase. SP3J could be involved in a signal transduction pathway coupling gene expression in the prespore to events in the mother cell, or it may be necessary for essential metabolic interactions between the two cells. The protein shows a high degree of similarity to Bacillus subtilis YQJG, to yeast OXA1 and also to bacterial 60 kDa inner-membrane proteins.

    Proteins where this domain is known:
    MAL8P1.14   


    PTHR12428:SF3 - PTHR12428:SF3 (Panther link)

    Proteins where this domain is known:
    MAL8P1.14   


    PTHR12438 - PTHR12438 (Panther link)

    Proteins where this domain is known:
    MAL8P1.157   


    PTHR12442 - PTHR12442 (Panther link)

    Proteins where this domain is known:
    PF10_0196    PF10_0371    PF14_0243    PFI1080w    PFL0610w   


    PTHR12442:SF10 - PTHR12442:SF10 (Panther link)

    Proteins where this domain is known:
    PFI1080w   


    PTHR12442:SF12 - PTHR12442:SF12 (Panther link)

    Proteins where this domain is known:
    PFL0610w   


    PTHR12442:SF22 - PTHR12442:SF22 (Panther link)

    Proteins where this domain is known:
    PF10_0196   


    PTHR12442:SF7 - PTHR12442:SF7 (Panther link)

    Proteins where this domain is known:
    PF14_0243   


    PTHR12443 - Sec62 (Panther link)

    Interpro entry IPR004728 : Translocation protein Sec62 (Interpro link)

    Interpro description:
    Members of the NSCC2 family have been sequenced from various yeast, fungal and animal species including Saccharomyces cerevisiae, Drosophila melanogaster and Homo sapiens. These proteins are the Sec62 proteins, believed to be associated with the Sec61 and Sec63 constituents of the general protein secretory systems of yeast microsomes. They are also the non-selective cation (NS) channels of the mammalian cytoplasmic membrane. The yeast Sec62 protein has been shown to be essential for cell growth. The mammalian NS channel proteins have been implicated in platelet-derived growth factor (PDGF) dependent single channel current in fibroblasts. These channels are essentially closed in serum-deprived tissue-culture cells and are specifically opened by exposure to PDGF. These channels are reported to exhibit equal selectivity for Na+, K+ and Cs+ with low permeability to Ca2+, and no permeability to anions.

    Proteins where this domain is known:
    PF14_0361   


    PTHR12443:SF2 - PTHR12443:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0361   


    PTHR12448 - PTHR12448 (Panther link)

    Proteins where this domain is known:
    MAL7P1.75   


    PTHR12458 - DUF667 (Panther link)

    Interpro entry IPR007714 : (Interpro link)

    Interpro description:
    This family of proteins is highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature.

    Proteins where this domain is known:
    PFE0415w   


    PTHR12466 - Cdc73 (Panther link)

    Interpro entry IPR007852 : (Interpro link)

    Interpro description:

    Paf1 is an RNA polymerase II-associated protein in yeast, which defines a complex that is distinct from the Srb/Mediator holoenzyme. The Paf1 complex, which also contains Cdc73, Ctr9, Hpr1, Ccr4, Rtf1 and Leo1, is required for full expression of a subset of yeast genes, particularly those responsive to signals from the Pkc1/MAP kinase cascade. The complex appears to play an essential role in RNA elongation.

    Proteins where this domain is known:
    PFE1040c   


    PTHR12466:SF4 - PTHR12466:SF4 (Panther link)

    Proteins where this domain is known:
    PFE1040c   


    PTHR12468 - Mannosyl_trans2 (Panther link)

    Interpro entry IPR007315 : Mannosyltransferase, PIG-V (Interpro link)

    Interpro description:

    This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI.

    Proteins where this domain is known:
    PFL2270w   


    PTHR12468:SF8 - PTHR12468:SF8 (Panther link)

    Proteins where this domain is known:
    PFL2270w   


    PTHR12472 - PTHR12472 (Panther link)

    Proteins where this domain is known:
    PF10_0158   


    PTHR12477 - PTHR12477 (Panther link)

    Proteins where this domain is known:
    PF14_0215   


    PTHR12477:SF3 - PTHR12477:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0215   


    PTHR12483 - Cop_transporter (Panther link)

    Interpro entry IPR007274 : Ctr copper transporter (Interpro link)

    Interpro description:

    The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport.

    Proteins where this domain is known:
    PF14_0211    PF14_0369   


    PTHR12499 - OPA3-like (Panther link)

    Interpro entry IPR010754 : (Interpro link)

    Interpro description:

    OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity.

    This family consists of several optic atrophy 3 (OPA3) proteins and related proteins from other eukaryotic species; their function is unknown.

    Proteins where this domain is known:
    PF14_0566   


    PTHR12509 - DUF1042 (Panther link)

    Interpro entry IPR010441 : (Interpro link)

    Interpro description:

    This is a family of proteins of unknown function.

    Proteins where this domain is known:
    PF14_0595    PFI1450c   


    PTHR12509:SF8 - SPERMATOGENESIS-ASSOCIATED 4 (Panther link)

    Proteins where this domain is known:
    PF14_0595   


    PTHR12509:SF9 - PTHR12509:SF9 (Panther link)

    Proteins where this domain is known:
    PFI1450c   


    PTHR12526 - PTHR12526 (Panther link)

    Proteins where this domain is known:
    PF10_0316   


    PTHR12526:SF37 - PTHR12526:SF37 (Panther link)

    Proteins where this domain is known:
    PF10_0316   


    PTHR12537 - RNA BINDING PROTEIN PUMILIO-RELATED (Panther link)

    Proteins where this domain is known:
    PFD0825c    PFE0935c   


    PTHR12537:SF12 - PUMILIO 1, 2 (Panther link)

    Proteins where this domain is known:
    PFD0825c   


    PTHR12537:SF13 - PUMILIO-RELATED (Panther link)

    Proteins where this domain is known:
    PFE0935c   


    PTHR12538 - Ribosomal_S26E (Panther link)

    Interpro entry IPR000892 : Ribosomal protein S26e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic ribosomal proteins can be grouped on the basis of sequence similarities. One of these families, the S26E family, includes mammalian S26; Octopus S26; Drosophila S26 (DS31); plant cytoplasmic S26; and fungal S26. These proteins have 114 to 127 amino acids.

    Proteins where this domain is known:
    PFB0830w   


    PTHR12546 - FER-1-LIKE (Panther link)

    Proteins where this domain is known:
    MAL8P1.134    PF14_0530    PFL1010c   


    PTHR12546:SF1 - PTHR12546:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0530    PFL1010c   


    PTHR12546:SF3 - gb def: Hypothetical protein (Panther link)

    Proteins where this domain is known:
    MAL8P1.134   


    PTHR12547 - PTHR12547 (Panther link)

    Proteins where this domain is known:
    PF10_0083   


    PTHR12553 - PTHR12553 (Panther link)

    Proteins where this domain is known:
    PF07_0100    PF14_0620   


    PTHR12553:SF1 - PTHR12553:SF1 (Panther link)

    Proteins where this domain is known:
    PF07_0100   


    PTHR12553:SF5 - PTHR12553:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0620   


    PTHR12555 - UFD1 (Panther link)

    Interpro entry IPR004854 : Ubiquitin fusion degradation protein UFD1 (Interpro link)

    Interpro description:
    Post-translational ubiquitin-protein conjugates are recognized for degradation by the ubiquitin fusion degradation (UFD) pathway. Several proteins involved in this pathway have been identified. This family includes UFD1, a 40kD protein that is essential for vegetative cell viability. The human UFD1 gene is expressed at high levels during embryogenesis, especially in the eyes and in the inner ear primordia and is thought to be important in the determination of ectoderm-derived structures, including neural crest cells. In addition, this gene is deleted in the CATCH-22 (cardiac defects, abnormal facies, thymic hypoplasia, cleft palate and hypocalcaemia with deletions on chromosome 22) syndrome. This clinical syndrome is associated with a variety of developmental defects, all characterised by microdeletions on 22q11.2. Two such developmental defects are the DiGeorge syndrome OMIM:188400, and the velo-cardio-facial syndrome OMIM:145410. Several of the abnormalities associated with these conditions are thought to be due to defective neural crest cell differentiation.

    Proteins where this domain is known:
    PF14_0178    PFE1235c    PFI0810c   


    PTHR12558 - PTHR12558 (Panther link)

    Proteins where this domain is known:
    PFE0085c   


    PTHR12558:SF2 - PTHR12558:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0085c   


    PTHR12560 - PTHR12560 (Panther link)

    Proteins where this domain is known:
    PF14_0034    PFE0405c   


    PTHR12561 - PTHR12561 (Panther link)

    Proteins where this domain is known:
    PF13_0083   


    PTHR12570 - DUF803 (Panther link)

    Interpro entry IPR008521 : (Interpro link)

    Interpro description:
    This family consists of several eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PFE1130w   


    PTHR12581 - PTHR12581 (Panther link)

    Proteins where this domain is known:
    PFB0370c   


    PTHR12586 - PTHR12586 (Panther link)

    Proteins where this domain is known:
    MAL8P1.58   


    PTHR12589 - 6_PTP_synth (Panther link)

    Interpro entry IPR007115 : 6-pyruvoyl tetrahydropterin synthase-related (Interpro link)

    Interpro description:

    The complex organic chemistry involved in the transformation of GTP to tetrahydrobiopterin is catalysed by only three enzymes: GTP cyclohydrolase I, 6-pyruvoyltetrahydropterin synthase and sepiapterin reductase. Tetrahydrobiopterin is the cofactor for several aromatic amino acid monooxygenases and the nitric oxide synthases. 6-Pyruvoyl tetrahydropterin synthase (PTPS), a Zn-dependent metalloprotein whose crystal structure is known, transforms dihydroneopterin triphosphate into 6-pyruvoyltetrahydropterin in the presence of Mg(II).

    The enzyme is homohexameric, composed of a dimer of trimers. A transition metal binding site formed by the three histidine residues 23, 48 and 50 is present in each subunit, and bound Zn(II) is responsible for the enzymatic activity. Site-directed mutagenesis of each of these three histidine residues results in a complete loss of metal binding and enzymatic activity.

    The function of the bacterial branch of the sequence lineage appears not to have been established.

    Proteins where this domain is known:
    PFF1360w   


    PTHR12592 - PTHR12592 (Panther link)

    Proteins where this domain is known:
    PF11_0453   


    PTHR12595 - PTHR12595 (Panther link)

    Proteins where this domain is known:
    PFA0530c   


    PTHR12596 - PTHR12596 (Panther link)

    Proteins where this domain is known:
    PFI0490c   


    PTHR12596:SF2 - PTHR12596:SF2 (Panther link)

    Proteins where this domain is known:
    PFI0490c   


    PTHR12599 - Trans_pterinDh (Panther link)

    Interpro entry IPR001533 : Transcriptional coactivator/pterin dehydratase (Interpro link)

    Interpro description:

    DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.

    Proteins where this domain is known:
    PF11_0095a   


    PTHR12603 - PTHR12603 (Panther link)

    Proteins where this domain is known:
    PFL1705w   


    PTHR12606 - PTHR12606 (Panther link)

    Proteins where this domain is known:
    PFL1635w   


    PTHR12612 - PTHR12612 (Panther link)

    Proteins where this domain is known:
    PF14_0122   


    PTHR12613 - ERO1 (Panther link)

    Interpro entry IPR007266 : Endoplasmic reticulum oxidoreductin 1 (Interpro link)

    Interpro description:
    Members of this family are required for the formation of disulphide bonds in the endoplasmic reticulum.

    Proteins where this domain is known:
    PF11_0251   


    PTHR12615 - PSS (Panther link)

    Interpro entry IPR004277 : Phosphatidyl serine synthase (Interpro link)

    Interpro description:
    Phosphatidyl serine synthase is also known as serine exchange enzyme. This family represents eukaryotic PSS I and II, membrane-bound proteins that catalyse the replacement of the head group of a phospholipid (phosphatidylcholine or phosphatidylethanolamine) by L-serine.

    Proteins where this domain is known:
    MAL13P1.335   


    PTHR12620 - U2_small (Panther link)

    Interpro entry IPR009145 : U2 auxiliary factor small subunit (Interpro link)

    Interpro description:

    The U2 small nuclear ribonucleoprotein auxiliary factor (U2AF) is a heterodimeric splicing factor composed of a large and a small subunit. The large U2AF subunit recognises the intronic polypyrimidine tract, a sequence located adjacent to the 3' splice site that serves as an important signal for both constitutive and regulated pre-mRNA splicing. The small subunit interacts with the 3' splice site dinucleotide AG and is essential for regulated splicing. The subunits shuttle continuously between the nucleus and the cytoplasm via a mechanism that involves carrier receptors and is independent of binding to mRNA. Both subunits contain an arginine/serine-rich (RS) domain, which acts as a nuclear localisation signal. Furthermore, the presence of an RS domain on either subunit is sufficient to trigger the nucleocytoplasmic import of the heterodimeric complex.

    The human form of the U2 auxiliary factor small subunit, hU2AF35, contains a degenerate RNA recognition motif (RRM) and a C-terminal RS domain. The murine form has been shown to be genomically imprinted with monoallelic expression from the paternal allele. However, this is not the case in humans.

    Proteins where this domain is known:
    PF11_0200   


    PTHR12626 - PTHR12626 (Panther link)

    Proteins where this domain is known:
    PF14_0546   


    PTHR12636 - Mra1 (Panther link)

    Interpro entry IPR005304 : Ribosomal biogenesis, methyltransferase, EMG1/NEP1 (Interpro link)

    Interpro description:

    Members of this family are essential for 40S ribosomal biogenesis. They play a role in the methylation reaction of pre-rRNA processing. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases.

    Proteins where this domain is known:
    PF08_0041    PFL1230w   


    PTHR12636:SF4 - PTHR12636:SF4 (Panther link)

    Proteins where this domain is known:
    PFL1230w   


    PTHR12638 - Mago_nashi (Panther link)

    Interpro entry IPR004023 : Mago nashi protein (Interpro link)

    Interpro description:
    This family was originally identified in Drosophila and called mago nashi; it is a strict maternal-effect, grandchildless-like gene. The human homologue has been shown to interact with an RNA binding protein, ribonucleoprotein rbm8. An RNAi knockout of the Caenorhabditis elegans homologue causes masculinization of the germ line (Mog phenotype) in hermaphrodites, suggesting it is involved in hermaphrodite germ-line sex determination, but the protein is also found in hermaphrodites and other organisms without sexual differentiation.

    Proteins where this domain is known:
    MAL7P1.139   


    PTHR12642 - PTHR12642 (Panther link)

    Proteins where this domain is known:
    MAL7P1.24   


    PTHR12645 - PTHR12645 (Panther link)

    Proteins where this domain is known:
    PFA0500w   


    PTHR12649 - PTHR12649 (Panther link)

    Proteins where this domain is known:
    PFD0355c   


    PTHR12650 - Ribosomal_S30 (Panther link)

    Interpro entry IPR006846 : Ribosomal protein S30 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry is for the ribosomal protein S30.

    Proteins where this domain is known:
    PFB0885w   


    PTHR12650:SF4 - PTHR12650:SF4 (Panther link)

    Proteins where this domain is known:
    PFB0885w   


    PTHR12651 - PTHR12651 (Panther link)

    Proteins where this domain is known:
    PFC0785c   


    PTHR12661 - PTHR12661 (Panther link)

    Proteins where this domain is known:
    PF08_0053   


    PTHR12674 - PTHR12674 (Panther link)

    Proteins where this domain is known:
    PF11_0292   


    PTHR12677 - PTHR12677 (Panther link)

    Proteins where this domain is known:
    MAL13P1.329    PFF0540c   


    PTHR12681 - PTHR12681 (Panther link)

    Proteins where this domain is known:
    PFL2150c   


    PTHR12682 - DUF101 (Panther link)

    Interpro entry IPR002804 : (Interpro link)

    Interpro description:

    Proteins in this entry are found in archaea, bacteria and eukaryotes. Their function is unknown, but alignment shows several conserved polar residues which are potential catalytic residues. The structure of one of these proteins has been determined and shows homology to heat shock protein 33, which is a chaperone protein that inhibits the aggregation of partially denatured proteins.

    Proteins where this domain is known:
    PF14_0269   


    PTHR12682:SF8 - PTHR12682:SF8 (Panther link)

    Proteins where this domain is known:
    PF14_0269   


    PTHR12683 - MAT1 (Panther link)

    Interpro entry IPR004575 : Cdk-activating kinase assembly factor (MAT1) (Interpro link)

    Interpro description:

    MAT1 (menage a trois 1) is a RING finger protein with a characteristic C3HC4 motif located in the N-terminal domain. MAT1 stabilises the cyclin H-CDK7 complex to form a functional CDK-activating kinase (CAK) enzymatic complex which then goes on to activate many of the CDK enzymes intimately involved in the cell cycle. CDK7 forms a stable complex with cyclin H and MAT1 in vivo only when phosphorylated on either one of two residues (Ser164 or Thr170) in its T-loop. The requirement for MAT1 for the activation of CAK can be by-passed by the phosphorylation of CDK7 on the T-loop. The two mechanisms for CDK7 complex stabilisation and activation (MAT1 addition and T-loop phosphorylation), which can operate independently in vitro, actually cooperate under physiological conditions to maintain complex integrity. With prolonged exposure to elevated temperature, dissociation to monomeric subunits occurs in vivo when CDK7 is dephosphorylated, even in the presence of MAT1.

    The Cyclin H-MAT1-CDK7 complex also forms part of TFIIH, a multiprotein complex required for both transcription and DNA repair.

    Proteins where this domain is known:
    PFE0610c   


    PTHR12686 - PTHR12686 (Panther link)

    Proteins where this domain is known:
    MAL7P1.104   


    PTHR12687 - UPF0120 (Panther link)

    Interpro entry IPR005343 : (Interpro link)

    Interpro description:

    This is a small family of mainly hypothetical proteins of unknown function.

    Proteins where this domain is known:
    PF13_0219   


    PTHR12687:SF1 - PTHR12687:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0219   


    PTHR12688 - DLIC (Panther link)

    Interpro entry IPR008467 : Dynein light intermediate chain (Interpro link)

    Interpro description:
    This family consists of several eukaryotic dynein light intermediate chain proteins. The light intermediate chains (LICs) of cytoplasmic dynein consist of multiple isoforms, which undergo post-translational modification to produce a large number of species. DLIC1 is known to be involved in assembly, organisation, and function of centrosomes and mitotic spindles when bound to pericentrin. DLIC2 is a subunit of cytoplasmic dynein 2 that may play a role in maintaining Golgi organisation by binding cytoplasmic dynein 2 to its Golgi-associated cargo.

    Proteins where this domain is known:
    PFI0315c   


    PTHR12694 - TFIIA (Panther link)

    Interpro entry IPR004855 : Transcription factor IIA, alpha/beta subunit (Interpro link)

    Interpro description:

    Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP, and can dissociate HMGB1 already bound to TBP/TATA-box.

    Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2.

    This entry represents the precursor that yields both the alpha and beta subunits of TFIIA. The TFIIA heterotrimer is an essential general transcription initiation factor for the expression of genes transcribed by RNA polymerase II.

    Proteins where this domain is known:
    MAL7P1.78   


    PTHR12694:SF1 - PTHR12694:SF1 (Panther link)

    Proteins where this domain is known:
    MAL7P1.78   


    PTHR12695 - PTHR12695 (Panther link)

    Proteins where this domain is known:
    MAL13P1.76   


    PTHR12695:SF1 - PTHR12695:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.76   


    PTHR12697 - PTHR12697 (Panther link)

    Proteins where this domain is known:
    PF13_0013   


    PTHR12708 - PTHR12708 (Panther link)

    Proteins where this domain is known:
    PFL1655c   


    PTHR12709 - PTHR12709 (Panther link)

    Proteins where this domain is known:
    PF10_0269    PF11_0058   


    PTHR12709:SF1 - PTHR12709:SF1 (Panther link)

    Proteins where this domain is known:
    PF10_0269   


    PTHR12709:SF2 - PTHR12709:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0058   


    PTHR12710 - PTHR12710 (Panther link)

    Proteins where this domain is known:
    PFE0380c   


    PTHR12713 - V-ATPase_G (Panther link)

    Interpro entry IPR005124 : Vacuolar (H+)-ATPase G subunit (Interpro link)

    Interpro description:
    This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments. Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialised cells. V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 complex, but is also thought to be strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation.

    Proteins where this domain is known:
    PF13_0130   


    PTHR12714 - PTHR12714 (Panther link)

    Proteins where this domain is known:
    PFL1780w   


    PTHR12718 - Cwf_Cwc_15 (Panther link)

    Interpro entry IPR006973 : Cwf15/Cwc15 cell cycle control protein (Interpro link)

    Interpro description:
    This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologues. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing.

    Proteins where this domain is known:
    PF07_0091   


    PTHR12722 - XAP5 (Panther link)

    Interpro entry IPR007005 : XAP5 protein (Interpro link)

    Interpro description:
    These proteins are found in a wide range of eukaryotes. Their function is uncertain though they are nuclear proteins, possibly with DNA-binding activity.

    Proteins where this domain is known:
    PFE1530c   


    PTHR12725 - PTHR12725 (Panther link)

    Proteins where this domain is known:
    PF11_0190   


    PTHR12725:SF4 - PTHR12725:SF4 (Panther link)

    Proteins where this domain is known:
    PF11_0190   


    PTHR12728 - PTHR12728 (Panther link)

    Proteins where this domain is known:
    PF10_0278   


    PTHR12729 - Thg1 (Panther link)

    Interpro entry IPR007537 : (Interpro link)

    Interpro description:

    The Thg1 protein from Saccharomyces cerevisiae (Baker's yeast) is responsible for adding a GMP residue to the 5' end of tRNA(His).

    Proteins where this domain is known:
    PF07_0095   


    PTHR12730 - PTHR12730 (Panther link)

    Proteins where this domain is known:
    PF07_0067   


    PTHR12732 - PTHR12732 (Panther link)

    Proteins where this domain is known:
    PFB0240w   


    PTHR12734 - PTHR12734 (Panther link)

    Proteins where this domain is known:
    PFE1115c   


    PTHR12735 - BolA (Panther link)

    Interpro entry IPR002634 : (Interpro link)

    Interpro description:
    This family consists of the morpho-protein BolA from Escherichia coli and its various homologs. In E. coli, over-expression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5.

    Proteins where this domain is known:
    PFE0790c   


    PTHR12735:SF2 - PTHR12735:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0790c   


    PTHR12736 - PTHR12736 (Panther link)

    Proteins where this domain is known:
    PFE1265w   


    PTHR12736:SF2 - PTHR12736:SF2 (Panther link)

    Proteins where this domain is known:
    PFE1265w   


    PTHR12741 - LYST-INTERACTING PROTEIN LIP5 (DOPAMINE RESPONSIVE PROTEIN DRG-1) (Panther link)

    Proteins where this domain is known:
    PFA0150c   


    PTHR12743 - Cyto_heme_lyase (Panther link)

    Interpro entry IPR000511 : Cytochrome c and c1 haem-lyase (Interpro link)

    Interpro description:
    Cytochrome c haem-lyase (CCHL) and cytochrome c1 haem-lyase (CC1HL) are mitochondrial enzymes that catalyse the covalent attachment of a haem group to two cysteine residues of cytochrome c and c1. These two enzymes are functionally and evolutionarily related. There are two conserved regions: the first is located in the central section and the second in the C-terminal section. Both patterns contain conserved histidine, tryptophan and acidic residues which could be important for the interaction of the enzymes with the apoproteins and/or the haem group.

    Proteins where this domain is known:
    PFL0180w    PFL1185c   


    PTHR12746 - PTHR12746 (Panther link)

    Proteins where this domain is known:
    PF07_0121   


    PTHR12749 - PTHR12749 (Panther link)

    Proteins where this domain is known:
    PFB0160w   


    PTHR12750 - PTHR12750 (Panther link)

    Proteins where this domain is known:
    PF14_0282   


    PTHR12750:SF1 - PTHR12750:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0282   


    PTHR12752 - PTHR12752 (Panther link)

    Proteins where this domain is known:
    MAL13P1.306   


    PTHR12755 - PTHR12755 (Panther link)

    Proteins where this domain is known:
    PF08_0040   


    PTHR12756 - PTHR12756 (Panther link)

    Proteins where this domain is known:
    PFA0170c   


    PTHR12760 - PTHR12760 (Panther link)

    Proteins where this domain is known:
    PF14_0098   


    PTHR12763 - PTHR12763 (Panther link)

    Proteins where this domain is known:
    PF07_0103   


    PTHR12765 - PTHR12765 (Panther link)

    Proteins where this domain is known:
    MAL13P1.34   


    PTHR12773 - PTHR12773 (Panther link)

    Proteins where this domain is known:
    PF14_0072   


    PTHR12775 - DUF602 (Panther link)

    Interpro entry IPR006735 : (Interpro link)

    Interpro description:
    This family represents several uncharacterised eukaryotic proteins.

    Proteins where this domain is known:
    PFI1320c   


    PTHR12777 - PTHR12777 (Panther link)

    Proteins where this domain is known:
    PFB0865w   


    PTHR12778 - PTHR12778 (Panther link)

    Proteins where this domain is known:
    PF10_0360   


    PTHR12778:SF4 - PTHR12778:SF4 (Panther link)

    Proteins where this domain is known:
    PF10_0360   


    PTHR12780 - RNA_pol_Rpc34 (Panther link)

    Interpro entry IPR016049 : (Interpro link)

    Interpro description:

    This entry represents a subunit specific to RNA Pol III, the tRNA-specific polymerase. The C34 subunit of Saccharomyces cerevisiae RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment.

    This entry also includes some homologous archaeal proteins of unknown function.

    Proteins where this domain is known:
    PF14_0207   


    PTHR12785 - PTHR12785 (Panther link)

    Proteins where this domain is known:
    PF14_0587   


    PTHR12785:SF5 - PTHR12785:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0587   


    PTHR12786 - PTHR12786 (Panther link)

    Proteins where this domain is known:
    PFI1215w   


    PTHR12786:SF2 - PTHR12786:SF2 (Panther link)

    Proteins where this domain is known:
    PFI1215w   


    PTHR12787 - mtransfer (Panther link)

    Interpro entry IPR007823 : (Interpro link)

    Interpro description:
    This family consists of uncharacterised eukaryotic proteins which are related to S-adenosyl-L-methionine-dependent methyltransferases.

    Proteins where this domain is known:
    PFI1235w   


    PTHR12789 - PTHR12789 (Panther link)

    Proteins where this domain is known:
    PF08_0079   


    PTHR12801 - PTHR12801 (Panther link)

    Proteins where this domain is known:
    PF13_0208   


    PTHR12801:SF12 - PTHR12801:SF12 (Panther link)

    Proteins where this domain is known:
    PF13_0208   


    PTHR12802 - PTHR12802 (Panther link)

    Proteins where this domain is known:
    PFL1215c   


    PTHR12802:SF4 - PTHR12802:SF4 (Panther link)

    Proteins where this domain is known:
    PFL1215c   


    PTHR12804 - SPC22 (Panther link)

    Interpro entry IPR007653 : Signal peptidase 22 kDa subunit (Interpro link)

    Interpro description:
    Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC), which consists of four subunits in yeast and five in mammals. This family is described as similar to microsomal signal peptidase 23 kDa subunit. It is found in eukaryotes.

    Proteins where this domain is known:
    PFI0215c   


    PTHR12805 - PTHR12805 (Panther link)

    Proteins where this domain is known:
    PF14_0643   


    PTHR12811 - PTHR12811 (Panther link)

    Proteins where this domain is known:
    PFL1935c   


    PTHR12814 - PTHR12814 (Panther link)

    Proteins where this domain is known:
    PFD0905w   


    PTHR12816 - PTHR12816 (Panther link)

    Proteins where this domain is known:
    PF07_0012   


    PTHR12817 - PTHR12817 (Panther link)

    Proteins where this domain is known:
    PFF0905w   


    PTHR12820 - PTHR12820 (Panther link)

    Proteins where this domain is known:
    PF07_0111   


    PTHR12821 - Bystin (Panther link)

    Interpro entry IPR007955 : (Interpro link)

    Interpro description:

    Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at the invasion front in the placenta from early pregnancy. This family also includes the Saccharomyces cerevisiae protein ENP1. ENP1 is an essential protein in S. cerevisiae and is localised in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing, as enp1-defective S. cerevisiae cannot synthesise 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits.

    Proteins where this domain is known:
    PF11_0105   


    PTHR12826 - PTHR12826 (Panther link)

    Proteins where this domain is known:
    PF14_0661   


    PTHR12826:SF1 - PTHR12826:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0661   


    PTHR12829 - PTHR12829 (Panther link)

    Proteins where this domain is known:
    PF07_0123   


    PTHR12831 - Tfb4 (Panther link)

    Interpro entry IPR004600 : Transcription factor Tfb4 (Interpro link)

    Interpro description:
    Members of this family are part of the TFIIH complex, which is involved in the initiation of transcription and nucleotide excision repair. The core-TFIIH basal transcription factor complex has six subunits; this is the p34 subunit.

    Proteins where this domain is known:
    PF13_0279   


    PTHR12834 - SRP9 (Panther link)

    Interpro entry IPR008832 : Signal recognition particle, SRP9 subunit (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents the 9 kDa SRP9 component. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.

    Proteins where this domain is known:
    MAL7P1.158   


    PTHR12834:SF1 - PTHR12834:SF1 (Panther link)

    Proteins where this domain is known:
    MAL7P1.158   


    PTHR12835 - BirA_ligase (Panther link)

    Interpro entry IPR004408 : Biotin--acetyl-CoA-carboxylase ligase (Interpro link)

    Interpro description:

    The biotin operon of Escherichia coli contains 5 structural genes involved in the synthesis of biotin. Transcription of the operon is regulated via one of these proteins, the biotin ligase BirA. BirA is an asymmetric protein with 3 specific domains - an N-terminal DNA-binding domain, a central catalytic domain and a C-terminal domain of unknown function. The ligase reaction intermediate, biotinyl-5'-AMP, is the co-repressor that triggers DNA binding by BirA. The alpha-helical N-terminal domain of the BirA protein has the helix-turn-helix structure of DNA-binding proteins with a central DNA recognition helix. BirA undergoes several conformational changes related to repressor function and the N-terminal DNA-binding domain is connected to the rest of the molecule through a hinge which allows relocation of the domains during the reaction. Biotin-binding causes a large structural change thought to facilitate ATP-binding.

    Two repressor molecules form the operator-repressor complex, with dimer formation occurring simultaneously with DNA binding. DNA-binding may also cause a conformational change which allows this co-operative interaction. In the dimer structure, the beta-sheets in the central domain of each monomer are arranged side-by-side to form a single, seamless beta-sheet.

    The apparent orthologs among the eukaryotes are larger proteins that contain a domain with high sequence homology to BirA.

    Proteins where this domain is known:
    PF10_0409    PF14_0573   


    PTHR12838 - Utp11 (Panther link)

    Interpro entry IPR007144 : Small-subunit processome, Utp11 (Interpro link)

    Interpro description:

    A large ribonuclear protein complex, the small-subunit (SSU) processome, is required for the processing of the small-ribosomal-subunit rRNA. This preribosomal complex contains the U3 snoRNA and at least 40 proteins.

    There appears to be a linkage between polymerase I transcription and the formation of the SSU processome, as some, but not all, of the SSU processome components are required for pre-rRNA transcription initiation. These SSU processome components have been termed t-Utps. They form a pre-complex with pre-18S rRNA in the absence of snoRNA U3 and other SSU processome components. It has been proposed that the t-Utp complex proteins are both rDNA- and rRNA-binding proteins that are involved in the initiation of pre-18S rRNA transcription, initially binding to rDNA and then associating with the 5' end of the nascent pre-18S rRNA. The t-Utp complex forms the nucleus around which the rest of the SSU processome components, including snoRNA U3, assemble. Electron microscopy suggests that the SSU processome may correspond to the terminal knobs visualized at the 5' ends of nascent 18S rRNA.

    This entry contains Utp11, a large ribonuclear protein that associates with snoRNA U3.

    Proteins where this domain is known:
    PFL2295w   


    PTHR12843 - PTHR12843 (Panther link)

    Proteins where this domain is known:
    PFI0815c   


    PTHR12849 - PTHR12849 (Panther link)

    Proteins where this domain is known:
    PF13_0222   


    PTHR12850 - Ribosomal_S25 (Panther link)

    Interpro entry IPR004977 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The S25 ribosomal protein is a component of the 40S ribosomal subunit.

    Proteins where this domain is known:
    PF14_0205   


    PTHR12855 - PTHR12855 (Panther link)

    Proteins where this domain is known:
    PFF1385c   


    PTHR12855:SF4 - PTHR12855:SF4 (Panther link)

    Proteins where this domain is known:
    PFF1385c   


    PTHR12857 - DUF866_euk (Panther link)

    Interpro entry IPR008584 : (Interpro link)

    Interpro description:
    This family consists of a number of hypothetical eukaryotic proteins of unknown function with an average length of around 165 residues.

    Proteins where this domain is known:
    MAL13P1.257   


    PTHR12858 - PTHR12858 (Panther link)

    Proteins where this domain is known:
    PF14_0494    PFA0330w   


    PTHR12858:SF1 - PTHR12858:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0494   


    PTHR12858:SF2 - PTHR12858:SF2 (Panther link)

    Proteins where this domain is known:
    PFA0330w   


    PTHR12864 - PTHR12864 (Panther link)

    Proteins where this domain is known:
    MAL13P1.182   


    PTHR12866 - PTHR12866 (Panther link)

    Proteins where this domain is known:
    PFI0280c   


    PTHR12867 - PTHR12867 (Panther link)

    Proteins where this domain is known:
    MAL8P1.133   


    PTHR12869 - PTHR12869 (Panther link)

    Proteins where this domain is known:
    PF10_0070   


    PTHR12873 - PTHR12873 (Panther link)

    Proteins where this domain is known:
    PF14_0112   


    PTHR12874 - PTHR12874 (Panther link)

    Proteins where this domain is known:
    PFL2255w   


    PTHR12874:SF5 - PTHR12874:SF5 (Panther link)

    Proteins where this domain is known:
    PFL2255w   


    PTHR12875 - DUF410 (Panther link)

    Interpro entry IPR007317 : (Interpro link)

    Interpro description:
    This is a family of conserved eukaryotic proteins with undetermined function.

    Proteins where this domain is known:
    PF14_0365   


    PTHR12882 - Spt4 (Panther link)

    Interpro entry IPR016046 : (Interpro link)

    Interpro description:

    This family consists of several eukaryotic transcription initiation Spt4 proteins and some related archaeal sequences. Three transcription-elongation factors Spt4, Spt5, and Spt6 are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. Spt4 and Spt5 are tightly associated in a complex, while the physical association of the Spt4-Spt5 complex with Spt6 is considerably weaker. It has been demonstrated that Spt4, Spt5, and Spt6 play roles in transcription elongation in both yeast and humans including a role in activation by Tat. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles.

    Proteins where this domain is known:
    PF10_0293   


    PTHR12883 - DUF1682 (Panther link)

    Interpro entry IPR012879 : (Interpro link)

    Interpro description:

    The members of this family are all hypothetical eukaryotic proteins of unknown function. One member is described as being an adipocyte-specific protein, but no evidence of this was found.

    Proteins where this domain is known:
    PFC0835c   


    PTHR12886 - Mannos_trans_DXD (Panther link)

    Interpro entry IPR007704 : Mannosyltransferase, DXD (Interpro link)

    Interpro description:
    PIG-M has a DXD motif. The DXD motif is found in many glycosyltransferases that utilise nucleotide sugars. It is thought that the motif is involved in the binding of a manganese ion that is required for association of the enzymes with nucleotide sugar substrates.

    Proteins where this domain is known:
    PFL0540w   


    PTHR12893 - GRASP55_65 (Panther link)

    Interpro entry IPR007583 : (Interpro link)

    Interpro description:
    GRASP55 (Golgi reassembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa protein) are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65 is an N-ethylmaleimide-sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system.

    Proteins where this domain is known:
    PF10_0168   


    PTHR12894 - CNH DOMAIN CONTAINING (Panther link)

    Proteins where this domain is known:
    PF14_0229   


    PTHR12894:SF10 - VAM6/VPS39 RELATED (Panther link)

    Proteins where this domain is known:
    PF14_0229   


    PTHR12897 - PTHR12897 (Panther link)

    Proteins where this domain is known:
    PFL1275c   


    PTHR12901 - SPERM PROTEIN HOMOLOG (Panther link)

    Proteins where this domain is known:
    MAL8P1.300   


    PTHR12903 - Ribosomal_L24 (Panther link)

    Interpro entry IPR003256 : Ribosomal protein L24 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L24 is one of the proteins from the large ribosomal subunit. In their mature form, these proteins have 103 to 150 amino-acid residues. This domain is found in L24 and L26 ribosomal proteins.

    Proteins where this domain is known:
    PFF0245w    PFL1150c   


    PTHR12906 - Rab5ip (Panther link)

    Interpro entry IPR010742 : (Interpro link)

    Interpro description:

    This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5.

    Proteins where this domain is known:
    PFD0669c   


    PTHR12911 - PTHR12911 (Panther link)

    Proteins where this domain is known:
    PFL0730w   


    PTHR12917 - PTHR12917 (Panther link)

    Proteins where this domain is known:
    PF14_0090   


    PTHR12917:SF1 - PTHR12917:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0090   


    PTHR12919 - Ribosomal_S16 (Panther link)

    Interpro entry IPR000307 : Ribosomal protein S16 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    S16 proteins have about 100 amino-acid residues.

    Proteins where this domain is known:
    PFE1560c   


    PTHR12919:SF5 - PTHR12919:SF5 (Panther link)

    Proteins where this domain is known:
    PFE1560c   


    PTHR12922 - PTHR12922 (Panther link)

    Proteins where this domain is known:
    PF11_0128   


    PTHR12930 - PTHR12930 (Panther link)

    Proteins where this domain is known:
    PF14_0416   


    PTHR12932 - P25-alpha (Panther link)

    Interpro entry IPR008907 : (Interpro link)

    Interpro description:
    This family consists of a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain-specific protein, but it is also found in Tetrahymena thermophila.

    Proteins where this domain is known:
    PFL1770c   


    PTHR12932:SF9 - PTHR12932:SF9 (Panther link)

    Proteins where this domain is known:
    PFL1770c   


    PTHR12933 - DUF1253 (Panther link)

    Interpro entry IPR010678 : (Interpro link)

    Interpro description:

    This family is defined by a C-terminal region of approximately 500 residues, which occurs in several hypothetical eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PF10_0054   


    PTHR12934 - Ribosom_L15_bac (Panther link)

    Interpro entry IPR005749 : Ribosomal protein L15, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents ribosomal protein L15 and homologues found in bacteria, chloroplasts and mitochondria.

    Proteins where this domain is known:
    PF14_0270    PF14_0276   


    PTHR12934:SF1 - PTHR12934:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0276   


    PTHR12934:SF2 - PTHR12934:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0270   


    PTHR12936 - APC10 (Panther link)

    Interpro entry IPR004939 : Anaphase-promoting complex, subunit 10 (Interpro link)

    Interpro description:

    The anaphase-promoting complex (APC) is a multi-subunit E3 protein ubiquitin ligase that is responsible for the metaphase to anaphase transition and the exit from mitosis. Anaphase is initiated when the APC triggers the destruction of securin, thereby allowing the protease, separase, to disrupt sister-chromatid cohesion. Securin ubiquitination by the APC is inhibited by cyclin-dependent kinase 1 (Cdk1)-dependent phosphorylation.

    Forkhead Box M1 (FoxM1), which is a transcription factor that is over-expressed in many cancers, is degraded in late mitosis and early G1 phase by the APC/cyclosome (APC/C) E3 ubiquitin ligase. The APC/C targets mitotic cyclins for destruction in mitosis and G1 phase and is then inactivated at S phase. It thereby generates alternating states of high and low cyclin-Cdk activity, which is required for the alternation of mitosis and DNA replication.

    APC from Schizosaccharomyces pombe and Saccharomyces cerevisiae was previously thought to have 11 subunits, but more sensitive techniques have identified 13 subunits in both yeasts.

    One of the subunits of the APC that is required for ubiquitination activity is APC10, a one-domain protein homologous to a sequence element, termed the DOC domain, found in several hypothetical proteins that may also mediate ubiquitination reactions, because they contain combinations of RING finger, cullin or HECT domains.

    The DOC domain consists of a beta-sandwich, in which a five-stranded antiparallel beta-sheet is packed on top of a three-stranded antiparallel beta-sheet, exhibiting a 'jellyroll' fold.

    Proteins known to contain a DOC domain include:

    Proteins where this domain is known:
    PFL0850w   


    PTHR12941 - UPF0172 (Panther link)

    Interpro entry IPR005366 : (Interpro link)

    Interpro description:

    This is a small family of proteins of unknown function.

    Proteins where this domain is known:
    PF11_0409   


    PTHR12942 - PTHR12942 (Panther link)

    Proteins where this domain is known:
    MAL13P1.242    PFF0500c   


    PTHR12944 - PTHR12944 (Panther link)

    Proteins where this domain is known:
    PFL0255c   


    PTHR12945 - EIF3_gamma (Panther link)

    Interpro entry IPR007316 : Eukaryotic initiation factor 3, gamma subunit (Interpro link)

    Interpro description:
    eIF-3 is a multisubunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3.

    Proteins where this domain is known:
    PFI0625c   


    PTHR12953 - PTHR12953 (Panther link)

    Proteins where this domain is known:
    PF14_0372   


    PTHR12963 - PTHR12963 (Panther link)

    Proteins where this domain is known:
    PF13_0209   


    PTHR12975 - PTHR12975 (Panther link)

    Proteins where this domain is known:
    PFE0250w   


    PTHR12982 - GPI2 (Panther link)

    Interpro entry IPR009450 : Phosphatidylinositol N-acetylglucosaminyltransferase (Interpro link)

    Interpro description:

    Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves the products of three genes in yeast (GPI1, GPI2 and GPI3) and four genes in mammals (GPI1, PIG A, PIG H and PIG C).

    Proteins where this domain is known:
    PFI0535w   


    PTHR12984 - PTHR12984 (Panther link)

    Proteins where this domain is known:
    PFE0170c   


    PTHR12984:SF2 - PTHR12984:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0170c   


    PTHR12993 - PIGL/MshB (Panther link)

    Interpro entry IPR003737 : (Interpro link)

    Interpro description:

    A number of the members of this family have been characterised as probable N-acetylglucosaminyl-phosphatidylinositol de-N-acetylases, which catalyse the second step in glycosylphosphatidylinositol (GPI) biosynthesis.

    Proteins where this domain is known:
    PFF1190c   


    PTHR12993:SF2 - PTHR12993:SF2 (Panther link)

    Proteins where this domain is known:
    PFF1190c   


    PTHR12998 - PTHR12998 (Panther link)

    Proteins where this domain is known:
    PF11_0316   


    PTHR12999 - PTHR12999 (Panther link)

    Proteins where this domain is known:
    PFD0405c   


    PTHR12999:SF3 - PTHR12999:SF3 (Panther link)

    Proteins where this domain is known:
    PFD0405c   


    PTHR13007 - PTHR13007 (Panther link)

    Proteins where this domain is known:
    PF11_0540    PFI1115c   


    PTHR13007:SF1 - PTHR13007:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0540   


    PTHR13007:SF5 - PTHR13007:SF5 (Panther link)

    Proteins where this domain is known:
    PFI1115c   


    PTHR13009 - HEAT SHOCK PROTEIN 90 (HSP90) CO-CHAPERONE AHA-1 (Panther link)

    Proteins where this domain is known:
    PFC0270w    PFC0360w   


    PTHR13016 - PTHR13016 (Panther link)

    Proteins where this domain is known:
    MAL13P1.172   


    PTHR13018 - PTHR13018 (Panther link)

    Proteins where this domain is known:
    PF11_0341    PFL2410w   


    PTHR13018:SF3 - PTHR13018:SF3 (Panther link)

    Proteins where this domain is known:
    PF11_0341   


    PTHR13018:SF4 - PTHR13018:SF4 (Panther link)

    Proteins where this domain is known:
    PFL2410w   


    PTHR13019 - DUF846_euk (Panther link)

    Interpro entry IPR008564 : Protein of unknown function DUF846, eukaryotic (Interpro link)

    Interpro description:
    This family consists of a number of conserved eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PF14_0562   


    PTHR13020 - PTHR13020 (Panther link)

    Proteins where this domain is known:
    PFI1525w   


    PTHR13020:SF4 - PTHR13020:SF4 (Panther link)

    Proteins where this domain is known:
    PFI1525w   


    PTHR13021 - Isy1 (Panther link)

    Interpro entry IPR009360 : (Interpro link)

    Interpro description:

    Isy1 protein is important in the optimisation of splicing.

    Proteins where this domain is known:
    PF14_0688   


    PTHR13021:SF4 - PTHR13021:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0688   


    PTHR13022 - EIF-3_p25 (Panther link)

    Interpro entry IPR009374 : Translation initiation factor 3, subunit 12, eukaryotic (Interpro link)

    Interpro description:

    This family consists of several eukaryotic translation initiation factor 3 subunit 12 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits.

    Proteins where this domain is known:
    PFC0441c   


    PTHR13027 - DUF254_SAND (Panther link)

    Interpro entry IPR004353 : (Interpro link)

    Interpro description:

    Members of this family have been called SAND proteins although these proteins do not contain a SAND domain. In Saccharomyces cerevisiae a protein complex of Mon1 and Ccz1 functions with the small GTPase Ypt7 to mediate vesicle trafficking to the vacuole. The Mon1/Ccz1 complex is conserved in eukaryotic evolution and members of this family (previously known as DUF254) are distant homologues to domains of known structure that assemble into cargo vesicle adapter (AP) complexes.

    Proteins where this domain is known:
    PF13_0274   


    PTHR13028 - Ebp2 (Panther link)

    Interpro entry IPR008610 : (Interpro link)

    Interpro description:
    This family consists of several eukaryotic rRNA processing protein EBP2 sequences. Ebp2p is required for the maturation of 25S rRNA and 60S subunit assembly. Ebp2p may be one of the target proteins of Rrs1p for executing the signal to regulate ribosome biogenesis.

    Proteins where this domain is known:
    PF10_0277   


    PTHR13040 - APG5 (Panther link)

    Interpro entry IPR007239 : Autophagy protein 5 (Interpro link)

    Interpro description:
    Macroautophagy is a bulk degradation process induced by starvation in eukaryotic cells. In yeast, 15 Apg proteins coordinate the formation of autophagosomes. No molecule involved in autophagy has yet been identified in higher eukaryotes. The pre-autophagosomal structure contains at least five Apg proteins: Apg1p, Apg2p, Apg5p, Aut7p/Apg8p and Apg16p. It is found in the vacuole. The C-terminal glycine of Apg12p is conjugated to a lysine residue of Apg5p via an isopeptide bond. During autophagy, cytoplasmic components are enclosed in autophagosomes and delivered to lysosomes/vacuoles. Autophagy protein 16 (Apg16) has been shown to bind to Apg5 and is required for the function of the Apg12p-Apg5p conjugate. Autophagy protein 5 (Apg5) is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway. This entry represents autophagy protein 5 (Apg5).

    Proteins where this domain is known:
    PF14_0283   


    PTHR13042 - PTHR13042 (Panther link)

    Proteins where this domain is known:
    PFL1830w   


    PTHR13047 - PTHR13047 (Panther link)

    Proteins where this domain is known:
    PFA0450c   


    PTHR13048 - PTHR13048 (Panther link)

    Proteins where this domain is known:
    PFD0895c   


    PTHR13061 - PTHR13061 (Panther link)

    Proteins where this domain is known:
    PF07_0098   


    PTHR13063 - PTHR13063 (Panther link)

    Proteins where this domain is known:
    PFE0900w   


    PTHR13063:SF1 - PTHR13063:SF1 (Panther link)

    Proteins where this domain is known:
    PFE0900w   


    PTHR13069 - PTHR13069 (Panther link)

    Proteins where this domain is known:
    PF10_0274   


    PTHR13072 - PTHR13072 (Panther link)

    Proteins where this domain is known:
    MAL8P1.68   


    PTHR13082 - SAP18 (Panther link)

    Interpro entry IPR010516 : (Interpro link)

    Interpro description:

    This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex, which is responsible for the repression of transcription via the modification of histone polypeptides. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death.

    Proteins where this domain is known:
    MAL7P1.37   


    PTHR13087 - DUF926 (Panther link)

    Interpro entry IPR009269 : (Interpro link)

    Interpro description:

    This is a family of eukaryotic proteins with undetermined function.

    Proteins where this domain is known:
    PF07_0030   


    PTHR13097 - PTHR13097 (Panther link)

    Proteins where this domain is known:
    MAL7P1.86   


    PTHR13107 - PTHR13107 (Panther link)

    Proteins where this domain is known:
    PFL1715w   


    PTHR13108 - PTHR13108 (Panther link)

    Proteins where this domain is known:
    MAL13P1.21   


    PTHR13108:SF4 - PTHR13108:SF4 (Panther link)

    Proteins where this domain is known:
    MAL13P1.21   


    PTHR13110 - PTHR13110 (Panther link)

    Proteins where this domain is known:
    PF08_0049   


    PTHR13112 - PTHR13112 (Panther link)

    Proteins where this domain is known:
    PF13_0158   


    PTHR13116 - DUF850_TM_euk (Panther link)

    Interpro entry IPR008568 : (Interpro link)

    Interpro description:
    This family consists of eukaryotic putative transmembrane proteins of unknown function.

    Proteins where this domain is known:
    MAL13P1.299   


    PTHR13119 - PTHR13119 (Panther link)

    Proteins where this domain is known:
    PF10_0186    PF14_0610   


    PTHR13119:SF3 - PTHR13119:SF3 (Panther link)

    Proteins where this domain is known:
    PF10_0186    PF14_0610   


    PTHR13120 - PHF5 (Panther link)

    Interpro entry IPR005345 : (Interpro link)

    Interpro description:

    Phf5 is a member of a novel murine multigene family that is highly conserved during evolution and belongs to the superfamily of PHD-finger proteins. At least one example, from Mus musculus (Mouse), may act as a chromatin-associated protein. The Schizosaccharomyces pombe (Fission yeast) ini1 gene is essential, required for splicing. It is localised in the nucleus, but not detected in the nucleolus and can be complemented by human ini1. The proteins of this family contain five CXXC motifs.

    Proteins where this domain is known:
    PF10_0179   


    PTHR13121 - PIG-U (Panther link)

    Interpro entry IPR009600 : GPI transamidase subunit PIG-U (Interpro link)

    Interpro description:

    Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the C terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI.

    Proteins where this domain is known:
    MAL13P1.165   


    PTHR13124 - PTHR13124 (Panther link)

    Proteins where this domain is known:
    PFF1305w   


    PTHR13126 - ATP11 (Panther link)

    Interpro entry IPR010591 : ATP11 (Interpro link)

    Interpro description:

    This family consists of several eukaryotic ATP11 proteins. The expression of functional F1-ATPase requires two proteins which are encoded by the ATP11 and ATP12 genes. Atp11p is a molecular chaperone of the mitochondrial matrix that participates in the biogenesis pathway to form F1, which is the catalytic unit of ATP synthase. It binds to the free beta subunits of F1, which prevents the beta subunit from associating with itself in non-productive complexes. It also allows for the formation of an (alpha beta)3 hexamer.

    Proteins where this domain is known:
    PFL0490c   


    PTHR13140 - PTHR13140 (Panther link)

    Proteins where this domain is known:
    MAL13P1.148    PF11_0416    PF13_0233    PFE0175c    PFF0675c    PFL1435c   


    PTHR13140:SF19 - PTHR13140:SF19 (Panther link)

    Proteins where this domain is known:
    PFF0675c   


    PTHR13140:SF24 - gb def: Myosin D (Fragment) (Panther link)

    Proteins where this domain is known:
    PFL1435c   


    PTHR13140:SF25 - PTHR13140:SF25 (Panther link)

    Proteins where this domain is known:
    PFE0175c   


    PTHR13140:SF31 - MYOSIN I (Panther link)

    Proteins where this domain is known:
    PF13_0233   


    PTHR13140:SF34 - PTHR13140:SF34 (Panther link)

    Proteins where this domain is known:
    MAL13P1.148   


    PTHR13140:SF35 - PTHR13140:SF35 (Panther link)

    Proteins where this domain is known:
    PF11_0416   


    PTHR13146 - PTHR13146 (Panther link)

    Proteins where this domain is known:
    PF07_0070   


    PTHR13152 - Tfb2 (Panther link)

    Interpro entry IPR004598 : Transcription factor Tfb2 (Interpro link)

    Interpro description:
    Members of this family are part of the TFIIH complex, which is involved in the initiation of transcription and nucleotide excision repair. The core-TFIIH basal transcription factor complex has six subunits; this is the p52 subunit.

    Proteins where this domain is known:
    PFL2125c   


    PTHR13159 - Radial_spoke (Panther link)

    Interpro entry IPR006802 : (Interpro link)

    Interpro description:
    This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene.

    Proteins where this domain is known:
    PF11_0057   


    PTHR13162 - PTHR13162 (Panther link)

    Proteins where this domain is known:
    PF11_0049    PF14_0170   


    PTHR13164 - PTHR13164 (Panther link)

    Proteins where this domain is known:
    PFL1845c   


    PTHR13166 - PTHR13166 (Panther link)

    Proteins where this domain is known:
    MAL13P1.53   


    PTHR13168 - PTHR13168 (Panther link)

    Proteins where this domain is known:
    PF07_0060   


    PTHR13173 - PTHR13173 (Panther link)

    Proteins where this domain is known:
    PF14_0026   


    PTHR13173:SF10 - PTHR13173:SF10 (Panther link)

    Proteins where this domain is known:
    PF14_0026   


    PTHR13182 - PTHR13182 (Panther link)

    Proteins where this domain is known:
    PFD0485w   


    PTHR13186 - SOH1 (Panther link)

    Interpro entry IPR008831 : SOH1 (Interpro link)

    Interpro description:
    The family consists of Saccharomyces cerevisiae SOH1 homologues. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 interacts not only with factors involved in DNA repair but also with factors involved in transcription. Thus, the SOH1 protein may serve to couple these two processes.

    Proteins where this domain is known:
    PF14_0718   


    PTHR13202 - SPC12 (Panther link)

    Interpro entry IPR009542 : Microsomal signal peptidase 12 kDa subunit (Interpro link)

    Interpro description:

    This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12).

    Proteins where this domain is known:
    PF14_0317   


    PTHR13205 - PTHR13205 (Panther link)

    Proteins where this domain is known:
    PFA0485w   


    PTHR13211 - PTHR13211 (Panther link)

    Proteins where this domain is known:
    PF14_0640   


    PTHR13227 - PTHR13227 (Panther link)

    Proteins where this domain is known:
    PF10_0183    PF14_0360   


    PTHR13230 - PTHR13230 (Panther link)

    Proteins where this domain is known:
    PFL0520c   


    PTHR13232 - PTHR13232 (Panther link)

    Proteins where this domain is known:
    PF14_0570   


    PTHR13238 - PTHR13238 (Panther link)

    Proteins where this domain is known:
    PFI1165c   


    PTHR13242 - PTHR13242 (Panther link)

    Proteins where this domain is known:
    PFF0590c   


    PTHR13244 - PTHR13244 (Panther link)

    Proteins where this domain is known:
    PFF0350w   


    PTHR13247 - PTHR13247 (Panther link)

    Proteins where this domain is known:
    MAL13P1.139   


    PTHR13261 - PTHR13261 (Panther link)

    Proteins where this domain is known:
    PF13_0136   


    PTHR13264 - mRNA_splic_SYF2 (Panther link)

    Interpro entry IPR013260 : (Interpro link)

    Interpro description:

    Proteins in this entry are involved in cell cycle progression and pre-mRNA splicing.

    Proteins where this domain is known:
    PF08_0128   


    PTHR13273 - DUF689 (Panther link)

    Interpro entry IPR007785 : (Interpro link)

    Interpro description:
    This family contains several uncharacterised eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    MAL8P1.31   


    PTHR13273:SF2 - PTHR13273:SF2 (Panther link)

    Proteins where this domain is known:
    MAL8P1.31   


    PTHR13282 - DUF1754_euk (Panther link)

    Interpro entry IPR013865 : (Interpro link)

    Interpro description:

    This is a eukaryotic protein of unknown function.

    Proteins where this domain is known:
    PFL1595w   


    PTHR13288 - PTHR13288 (Panther link)

    Proteins where this domain is known:
    PF14_0513   


    PTHR13288:SF8 - PTHR13288:SF8 (Panther link)

    Proteins where this domain is known:
    PF14_0513   


    PTHR13296 - PTHR13296 (Panther link)

    Proteins where this domain is known:
    PFF0695w   


    PTHR13303 - PTHR13303 (Panther link)

    Proteins where this domain is known:
    PF14_0167   


    PTHR13304 - Gaa1 (Panther link)

    Interpro entry IPR007246 : Gaa1-like, GPI transamidase component (Interpro link)

    Interpro description:

    GPI (glycosyl phosphatidyl inositol) transamidase is a multiprotein complex required for a terminal step in attaching the glycosylphosphatidylinositol (GPI) anchor to proteins. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase.

    Proteins where this domain is known:
    MAL13P1.348   


    PTHR13305 - Nop10p_RNA_bd (Panther link)

    Interpro entry IPR007264 : (Interpro link)

    Interpro description:
    Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs.

    Proteins where this domain is known:
    PF14_0784   


    PTHR13317 - DUF747_CMV_rcpt (Panther link)

    Interpro entry IPR008010 : (Interpro link)

    Interpro description:

    This is a family of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH, but this has since been disputed. Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton.

    Proteins where this domain is known:
    PFC0970w   


    PTHR13326 - PsU_synth_TruD (Panther link)

    Interpro entry IPR001656 : tRNA pseudouridine synthase D (Interpro link)

    Interpro description:

    This entry represents tRNA pseudouridine synthase D (TruD) proteins, which appear to be responsible for synthesis of pseudouridine from uracil-13 in transfer RNAs. They are hydrophilic proteins of 39 to 77 kDa, and homologues are found in bacteria, archaea, and eukarya.

    Proteins where this domain is known:
    PF07_0125    PF10_0341   


    PTHR13348 - PTHR13348 (Panther link)

    Proteins where this domain is known:
    PFF1355w   


    PTHR13369 - PTHR13369 (Panther link)

    Proteins where this domain is known:
    PFE1535w   


    PTHR13370 - PTHR13370 (Panther link)

    Proteins where this domain is known:
    PF13_0236   


    PTHR13371 - PTHR13371 (Panther link)

    Proteins where this domain is known:
    PFL1510c   


    PTHR13389 - PTHR13389 (Panther link)

    Proteins where this domain is known:
    PFF1030w   


    PTHR13393 - DUF890 (Panther link)

    Interpro entry IPR010286 : S-adenosyl-L-methionine dependent methyltransferase, predicted (Interpro link)

    Interpro description:

    This family consists of several conserved hypothetical proteins from both eukaryotes and prokaryotes. The functions of members of this family are unknown, but they are predicted to be SAM-dependent methyltransferases.

    Proteins where this domain is known:
    PF14_0115   


    PTHR13398 - PTHR13398 (Panther link)

    Proteins where this domain is known:
    PFI0445c   


    PTHR13408 - RNA_pol_Rpc4 (Panther link)

    Interpro entry IPR007811 : RNA polymerase III Rpc4 (Interpro link)

    Interpro description:
    This family comprises a specific subunit for Pol III, the tRNA specific polymerase.

    Proteins where this domain is known:
    PF14_0603   


    PTHR13412 - PTHR13412 (Panther link)

    Proteins where this domain is known:
    PFE1445c   


    PTHR13420 - DUF167 (Panther link)

    Interpro entry IPR003746 : (Interpro link)

    Interpro description:

    This entry describes proteins of unknown function. Structures for two of these proteins, YggU from Escherichia coli and MTH637 from the archaeon Methanobacterium thermoautotrophicum, have been determined; they have a core 2-layer alpha/beta structure consisting of beta(2)-loop-alpha-beta(2)-alpha.

    Proteins where this domain is known:
    PF14_0542   


    PTHR13421 - PTHR13421 (Panther link)

    Proteins where this domain is known:
    PF14_0580   


    PTHR13421:SF3 - PTHR13421:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0580   


    PTHR13451 - PTHR13451 (Panther link)

    Proteins where this domain is known:
    PF14_0470   


    PTHR13476 - PTHR13476 (Panther link)

    Proteins where this domain is known:
    MAL8P1.114   


    PTHR13479 - Ribosomal_S18 (Panther link)

    Interpro entry IPR001648 : Ribosomal protein S18 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18; algal and plant chloroplast S18; and cyanelle S18.

    Proteins where this domain is known:
    PFL0570c   


    PTHR13479:SF16 - PTHR13479:SF16 (Panther link)

    Proteins where this domain is known:
    PFL0570c   


    PTHR13486 - Hep_59 (Panther link)

    Interpro entry IPR010756 : (Interpro link)

    Interpro description:

    This family represents a conserved region approximately 100 residues long within mammalian hepatocellular carcinoma-associated antigen 59 and similar proteins. Family members are found in a variety of eukaryotes, mainly as hypothetical proteins.

    Proteins where this domain is known:
    PF14_0650   


    PTHR13486:SF2 - PTHR13486:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0650   


    PTHR13489 - PTHR13489 (Panther link)

    Proteins where this domain is known:
    PF14_0120   


    PTHR13507 - DUF1168 (Panther link)

    Interpro entry IPR009548 : (Interpro link)

    Interpro description:

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PFD0785c   


    PTHR13509 - PTHR13509 (Panther link)

    Proteins where this domain is known:
    MAL8P1.51   


    PTHR13516 - PTHR13516 (Panther link)

    Proteins where this domain is known:
    MAL13P1.233    PF08_0074   


    PTHR13522 - PTHR13522 (Panther link)

    Proteins where this domain is known:
    MAL13P1.219   


    PTHR13527 - PTHR13527 (Panther link)

    Proteins where this domain is known:
    MAL8P1.26   


    PTHR13542 - PTHR13542 (Panther link)

    Proteins where this domain is known:
    PFF0580w   


    PTHR13563 - DUF425 (Panther link)

    Interpro entry IPR007356 : (Interpro link)

    Interpro description:

    In transfer RNA many different modified nucleosides are found, especially in the anticodon region. tRNA (guanine-N1-)-methyltransferase is one of several nucleases operating together with the tRNA-modifying enzymes before the formation of the mature tRNA. It catalyses the reaction:

     S-adenosyl-L-methionine + tRNA -> S-adenosyl-L-homocysteine + tRNA containing N1-methylguanine
    methylating guanosine (G) to N1-methylguanine (1-methylguanosine (m1G)) at position 37 of tRNAs that read CUN (leucine), CCN (proline), and CGG (arginine) codons. The presence of m1G improves the cellular growth rate and the polypeptide step time and also prevents the tRNA from shifting the reading frame.
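
    As an illustration of the codon notation used above, the following minimal Python sketch expands the patterns CUN, CCN and CGG into concrete codons, treating N as any of the four RNA bases, and looks up each one in the standard genetic code. The function and variable names are hypothetical and not part of InterPro or any library; the table holds only the nine codons needed here.

```python
from itertools import product

# Subset of the standard genetic code covering only the codons discussed above.
STANDARD_CODE_SUBSET = {
    "CUU": "Leu", "CUC": "Leu", "CUA": "Leu", "CUG": "Leu",
    "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
    "CGG": "Arg",
}


def expand_codon_pattern(pattern):
    """Expand a codon pattern in which N stands for any of U, C, A, G."""
    choices = [("U", "C", "A", "G") if base == "N" else (base,) for base in pattern]
    return ["".join(bases) for bases in product(*choices)]


def codons_read_by_m1g37_trnas():
    """Collect every codon matched by the three patterns named in the text."""
    codons = []
    for pattern in ("CUN", "CCN", "CGG"):
        codons.extend(expand_codon_pattern(pattern))
    return codons


if __name__ == "__main__":
    for codon in codons_read_by_m1g37_trnas():
        print(codon, STANDARD_CODE_SUBSET[codon])
```

    The three patterns together cover nine codons: four for leucine, four for proline and one for arginine.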

    The mechanism of the trmD3-induced frameshift involving mutant tRNA(Pro) and tRNA(Leu) species has been investigated. It has been suggested that the conformation of the anticodon loop may be a major determining element for the formation of m1G37 in vivo.

    Family member HYNA is the product of a novel gene expressed in human liver cancer tissue.

    Proteins where this domain is known:
    PF11_0198   


    PTHR13563:SF6 - PTHR13563:SF6 (Panther link)

    Proteins where this domain is known:
    PF11_0198   


    PTHR13586 - PTHR13586 (Panther link)

    Proteins where this domain is known:
    PF14_0717   


    PTHR13600 - LCM_mtfrase (Panther link)

    Interpro entry IPR007213 : Leucine carboxyl methyltransferase (Interpro link)

    Interpro description:

    This entry represents a group of leucine carboxymethyltransferases which methylate the carboxyl group of leucine residues to form alpha-leucine ester residues. It includes LCMT1, which regulates the activity of serine/threonine phosphatase 2A (PP2A) through methylation of the C-terminal leucine residue of the catalytic subunit of PP2A. This affects the heteromultimeric composition of PP2A, which in turn affects protein recognition and substrate specificity. Like many other methyltransferases, LCMT1 uses S-adenosylmethionine (SAM) as the methyl donor. LCMT1 contains the common SAM-dependent methyltransferase core fold, with various insertions and additions creating a specific PP2A binding site. This entry also contains LCMT2, a homologue of LCMT1 which is not necessary for PP2A methylation and whose function is not clear.

    Proteins where this domain is known:
    PF14_0376   


    PTHR13604 - DUF159 (Panther link)

    Interpro entry IPR003738 : (Interpro link)

    Interpro description:

    This entry describes proteins of unknown function.

    Proteins where this domain is known:
    PFL0105w   


    PTHR13620 - PTHR13620 (Panther link)

    Proteins where this domain is known:
    PFA0290w   


    PTHR13622 - PTHR13622 (Panther link)

    Proteins where this domain is known:
    PFI1195c   


    PTHR13622:SF1 - PTHR13622:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1195c   


    PTHR13634 - PTHR13634 (Panther link)

    Proteins where this domain is known:
    PF07_0122    PFB0720c   


    PTHR13651 - PTHR13651 (Panther link)

    Proteins where this domain is known:
    PF13_0017   


    PTHR13680 - PTHR13680 (Panther link)

    Proteins where this domain is known:
    PFC0126c   


    PTHR13680:SF2 - PTHR13680:SF2 (Panther link)

    Proteins where this domain is known:
    PFC0126c   


    PTHR13683 - Peptidase_A1 (Panther link)

    Interpro entry IPR001461 : Peptidase A1 (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    This group of aspartic peptidases belongs to MEROPS peptidase family A1 (pepsin family, clan AA). The type example is pepsin A from Homo sapiens (Human).

    More than 70 aspartic peptidases, all from eukaryotic organisms, have been identified. These include pepsins, cathepsins, and renins. The enzymes are synthesised with signal peptides, and the proenzymes are secreted or passed into the lysosomal/endosomal system, where acidification leads to autocatalytic activation.

    Most members of the pepsin family specifically cleave bonds in peptides that are at least six residues in length, with hydrophobic residues in both the P1 and P1' positions. Crystallography has shown the active site to form a groove across the junction of the two lobes, with an extended loop projecting over the cleft to form an 11-residue flap, which encloses substrates and inhibitors within the active site. Specificity is determined by several hydrophobic residues surrounding the catalytic aspartates, and by three residues in the flap. Cysteine residues are well conserved within the pepsin family, pepsin itself containing three disulphide loops. The first loop is found in all but the fungal enzymes, and is usually around five residues in length, but is longer in barrierpepsin and candidapepsin; the second loop is also small and found only in the animal enzymes; and the third loop is the largest, found in all members of the family, except for the cysteine-free polyporopepsin. The loops are spread unequally throughout the two lobes, suggesting that they formed after the initial gene duplication and fusion event.

    This family includes neither the retroviral nor the retrotransposon aspartic proteases, which are much smaller and appear to be homologous to the single domain aspartic proteases.

    Proteins where this domain is known:
    PF08_0108    PF10_0329    PF13_0133    PF14_0075    PF14_0076    PF14_0077    PF14_0078    PF14_0281    PF14_0625    PFC0495w    PFL1660c   


    PTHR13683:SF45 - PTHR13683:SF45 (Panther link)

    Proteins where this domain is known:
    PF14_0075    PF14_0076    PF14_0077    PF14_0078   


    PTHR13683:SF92 - PEPSINOGEN A-RELATED (Panther link)

    Proteins where this domain is known:
    PF08_0108    PF14_0281   


    PTHR13683:SF93 - PTHR13683:SF93 (Panther link)

    Proteins where this domain is known:
    PF14_0625   


    PTHR13683:SF96 - ASPARTYL PROTEASE-RELATED (Panther link)

    Proteins where this domain is known:
    PF13_0133   


    PTHR13683:SF98 - PTHR13683:SF98 (Panther link)

    Proteins where this domain is known:
    PFL1660c   


    PTHR13691 - Ribosomal_L2 (Panther link)

    Interpro entry IPR002171 : Ribosomal protein L2 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L2 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L2 is known to bind to the 23S rRNA and to have peptidyltransferase activity. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    Proteins where this domain is known:
    PF11_0337    PFE0845c   


    PTHR13691:SF4 - PTHR13691:SF4 (Panther link)

    Proteins where this domain is known:
    PFE0845c   


    PTHR13691:SF5 - Ribosom_L2_bac (Panther link)

    Interpro entry IPR005880 : Ribosomal protein L2, bacterial-type (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The protein L2 is found in all ribosomes and is one of the best conserved proteins of this megadalton complex. L2 is elongated, exposing one end of the protein to the surface of the intersubunit interface of the 50S subunit. It is essential for the association of the ribosomal subunits and might participate in the binding and translocation of the tRNAs. This entry represents bacterial, chloroplast and mitochondrial forms.

    Proteins where this domain is known:
    PF11_0337   


    PTHR13693 - CLASS II AMINOTRANSFERASE/8-AMINO-7-OXONONANOATE SYNTHASE (Panther link)

    Proteins where this domain is known:
    PF14_0155    PFL2210w   


    PTHR13693:SF6 - PTHR13693:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0155   


    PTHR13693:SF7 - 5-AMINOLEVULINIC ACID SYNTHASE (Panther link)

    Proteins where this domain is known:
    PFL2210w   


    PTHR13697 - Ppfruckinase (Panther link)

    Interpro entry IPR000023 : Phosphofructokinase (Interpro link)

    Interpro description:

    The enzyme-catalysed transfer of a phosphoryl group from ATP is an important reaction in a wide variety of biological processes. One enzyme that utilises this reaction is phosphofructokinase (PFK), which catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate, a key regulatory step in the glycolytic pathway. PFK exists as a homotetramer in bacteria and mammals (where each monomer possesses 2 similar domains), and as an octamer in yeast (where there are 4 alpha- (PFK1) and 4 beta-chains (PFK2), the latter, like the mammalian monomers, possessing 2 similar domains).

    PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.

    Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.

    Proteins where this domain is known:
    PF11_0294    PFI0755c   


    PTHR13710 - RecQ (Panther link)

    Interpro entry IPR004589 : DNA helicase, ATP-dependent, RecQ type (Interpro link)

    Interpro description:

    The ATP-dependent DNA helicase RecQ is involved in genome maintenance. All homologues tested to date unwind paired DNA, translocating in a 3' to 5' direction, and several have a preference for forked or 4-way DNA structures (e.g. Holliday junctions) or for G-quartet DNA. The yeast protein, Sgs1, is present in numerous foci that coincide with sites of de novo DNA synthesis, such as the replication fork, and protein levels peak during S-phase.

    A model has been proposed for Sgs1p action in the S-phase checkpoint response, both as a 'sensor' for damage during replication and a 'resolvase' for structures that arise at paused forks, such as the four-way 'chickenfoot' structure. The action of Sgs1p may serve to maintain the proper amount and integrity of ss DNA that is necessary for the binding of RPA (replication protein A, the eukaryotic ss DNA-binding protein)-DNA pol complexes. Sgs1p would thus function by detecting (or resolving) aberrant DNA structures, and would thus contribute to the full activation of the DNA-dependent protein kinase, Mec1p and the effector kinase, Rad53p. Its ability to bind both the large subunit of RPA and the RecA-like protein Rad51p places it in a unique position to resolve inappropriate fork structures that can occur when either the leading or lagging strand synthesis is stalled. Thus, RecQ helicases integrate checkpoint activation and checkpoint response.

    Proteins where this domain is known:
    PF14_0278    PFI0910w   


    PTHR13710:SF15 - PTHR13710:SF15 (Panther link)

    Proteins where this domain is known:
    PF14_0278   


    PTHR13710:SF5 - PTHR13710:SF5 (Panther link)

    Proteins where this domain is known:
    PFI0910w   


    PTHR13711 - PTHR13711 (Panther link)

    Proteins where this domain is known:
    MAL13P1.290    MAL8P1.72    PF14_0393    PFL0145c   


    PTHR13711:SF2 - PTHR13711:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0393   


    PTHR13712:SF18 - PTHR13712:SF18 (Panther link)

    Proteins where this domain is known:
    PF11_0429   


    PTHR13718 - Ribosomal_S5 (Panther link)

    Interpro entry IPR000851 : Ribosomal protein S5 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S5 is one of the proteins from the small ribosomal subunit, and is a protein of 166 to 254 amino-acid residues. In Escherichia coli, S5 is known to be important in the assembly and function of the 30S ribosomal subunit. Mutations in S5 have been shown to increase translational error frequencies. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, cyanelle, red algal chloroplast, archaeal and fungal mitochondrial S5; mammalian, Caenorhabditis elegans, Drosophila and plant S2; and yeast S4 (SUP44).

    Proteins where this domain is known:
    PF14_0448   


    PTHR13718:SF4 - PTHR13718:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0448   


    PTHR13720 - PTHR13720 (Panther link)

    Proteins where this domain is known:
    PFB0755w   


    PTHR13734 - PTHR13734 (Panther link)

    Proteins where this domain is known:
    PF11_0212   


    PTHR13734:SF10 - PTHR13734:SF10 (Panther link)

    Proteins where this domain is known:
    PF11_0212   


    PTHR13743 - PTHR13743 (Panther link)

    Proteins where this domain is known:
    PF11_0252   


    PTHR13748 - PTHR13748 (Panther link)

    Proteins where this domain is known:
    PF14_0052   


    PTHR13748:SF7 - PTHR13748:SF7 (Panther link)

    Proteins where this domain is known:
    PF14_0052   


    PTHR13763 - BRCA1 (Panther link)

    Interpro entry IPR011364 : BRCA1 (Interpro link)

    Interpro description:

    This group represents a DNA-damage repair protein, BRCA1. Germline mutations of the tumour-suppressor gene product BRCA1 are responsible for about 50% of familial breast cancer cases. The protein contains the BRCT C-terminal domain, an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.

    Proteins where this domain is known:
    PF10_0117   


    PTHR13767 - PTHR13767 (Panther link)

    Proteins where this domain is known:
    PF10_0175   


    PTHR13768 - NSF_attach (Panther link)

    Interpro entry IPR000744 : NSF attachment protein (Interpro link)

    Interpro description:

    Regulated exocytosis of neurotransmitters and hormones, as well as intracellular traffic, requires fusion of two lipid bilayers. SNARE proteins are thought to form a protein bridge, the SNARE complex, between an incoming vesicle and the acceptor compartment. SNARE proteins contribute to the specificity of membrane fusion, implying that the mechanisms by which SNAREs are targeted to subcellular compartments are important for specific docking and fusion of vesicles. This mechanism involves a family of conserved proteins, members of which appear to function at all sites of constitutive and regulated secretion in eukaryotes. Among them are 2 types of cytosolic protein, NSF (N-ethyl-maleimide-sensitive protein) and the SNAPs (alpha-, beta- and gamma-soluble NSF attachment proteins). The yeast vesicular fusion protein, sec17, a cytoplasmic peripheral membrane protein involved in vesicular transport between the endoplasmic reticulum and the Golgi apparatus, shows a high degree of sequence similarity to the alpha-SNAP family.

    SNAP-25 and its non-neuronal homologue Syndet/SNAP-23 are synthesized as soluble proteins in the cytosol. Both SNAP-25 and Syndet/SNAP-23 are palmitoylated at cysteine residues clustered in a loop between two N- and C-terminal coils, and palmitoylation is essential for membrane binding and plasma membrane targeting. The C-terminal and the N-terminal helices of SNAP-25 are each targeted to the plasma membrane by two distinct cysteine-rich domains and appear to regulate the availability of SNAP to form complexes with SNARE.

    Proteins where this domain is known:
    PFE0445c   


    PTHR13768:SF4 - PTHR13768:SF4 (Panther link)

    Proteins where this domain is known:
    PFE0445c   


    PTHR13773 - PTHR13773 (Panther link)

    Proteins where this domain is known:
    PF14_0097   


    PTHR13779 - PTHR13779 (Panther link)

    Proteins where this domain is known:
    PF11_0131   


    PTHR13779:SF1 - PTHR13779:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0131   


    PTHR13793 - PTHR13793 (Panther link)

    Proteins where this domain is known:
    PF14_0724   


    PTHR13793:SF17 - PTHR13793:SF17 (Panther link)

    Proteins where this domain is known:
    PF14_0724   


    PTHR13803 - PTHR13803 (Panther link)

    Proteins where this domain is known:
    PF13_0324    PFD0250c   


    PTHR13822 - ATPase_F1_d/e (Panther link)

    Interpro entry IPR001469 : ATPase, F1 complex, delta/epsilon subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF11_0485   


    PTHR13824 - NALP (NACHT, LEUCINE RICH REPEAT AND PYRIN DOMAIN CONTAINING)-RELATED (Panther link)

    Proteins where this domain is known:
    PF14_0021    PFL2380c   


    PTHR13824:SF9 - LEUCINE RICH REPEAT-CONTAINING (Panther link)

    Proteins where this domain is known:
    PFL2380c   


    PTHR13829 - PTHR13829 (Panther link)

    Proteins where this domain is known:
    PFE1020w   


    PTHR13829:SF2 - PTHR13829:SF2 (Panther link)

    Proteins where this domain is known:
    PFE1020w   


    PTHR13831 - PTHR13831 (Panther link)

    Proteins where this domain is known:
    PFE0090w   


    PTHR13832 - PP2C (Panther link)

    Interpro entry IPR015655 : (Interpro link)

    Interpro description:

    Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa that shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. The exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (Mouse-ear cress) (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and contains a C-terminal PP2C domain.

    PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.

    Proteins where this domain is known:
    MAL13P1.44    MAL8P1.108    MAL8P1.109    PF11_0362    PF11_0396    PF14_0523    PFD0505c    PFE1010w    PFL2365w   


    PTHR13832:SF11 - PTHR13832:SF11 (Panther link)

    Proteins where this domain is known:
    PF11_0362   


    PTHR13832:SF24 - PTHR13832:SF24 (Panther link)

    Proteins where this domain is known:
    PFL2365w   


    PTHR13832:SF89 - PTHR13832:SF89 (Panther link)

    Proteins where this domain is known:
    MAL8P1.108   


    PTHR13832:SF91 - PTHR13832:SF91 (Panther link)

    Proteins where this domain is known:
    PF11_0396   


    PTHR13832:SF92 - PTHR13832:SF92 (Panther link)

    Proteins where this domain is known:
    PFD0505c   


    PTHR13832:SF94 - PTHR13832:SF94 (Panther link)

    Proteins where this domain is known:
    MAL13P1.44    MAL8P1.109   


    PTHR13832:SF95 - PTHR13832:SF95 (Panther link)

    Proteins where this domain is known:
    PF14_0523   


    PTHR13844 - PTHR13844 (Panther link)

    Proteins where this domain is known:
    PFF0560c   


    PTHR13847 - PTHR13847 (Panther link)

    Proteins where this domain is known:
    MAL13P1.390    PF13_0345    PFF0815w    PFI1255w   


    PTHR13847:SF4 - FAD OXIDOREDUCTASE (Panther link)

    Proteins where this domain is known:
    MAL13P1.390   


    PTHR13847:SF5 - PTHR13847:SF5 (Panther link)

    Proteins where this domain is known:
    PF13_0345   


    PTHR13847:SF8 - PTHR13847:SF8 (Panther link)

    Proteins where this domain is known:
    PFF0815w   


    PTHR13861 - ATPase_V1_F_euk (Panther link)

    Interpro entry IPR005772 : ATPase, V1 complex, subunit F, eukaryotic (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    V-ATPases (also known as V1V0-ATPase or vacuolar ATPase) are found in the eukaryotic endomembrane system, and in the plasma membrane of prokaryotes and certain specialised eukaryotic cells. V-ATPases hydrolyse ATP to drive a proton pump, and are involved in a variety of vital intra- and inter-cellular processes such as receptor mediated endocytosis, protein trafficking, active transport of metabolites, homeostasis and neurotransmitter release. V-ATPases are composed of two linked complexes: the V1 complex (subunits A-H) contains the catalytic core that hydrolyses ATP, while the V0 complex (subunits a, c, c', c'', d) forms the membrane-spanning pore. V-ATPases may have an additional role in membrane fusion through binding to t-SNARE proteins.

    This entry represents subunit F found in the V1 complex of V-ATPases in eukaryotes. Subunit F is a 16 kDa protein that is required for the assembly and activity of V-ATPase, and has a potential role in the differential targeting and regulation of the enzyme for specific organelles. This subunit is not necessary for the rotation of the ATPase V1 rotor, but it does promote catalysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF11_0412   


    PTHR13871 - PTHR13871 (Panther link)

    Proteins where this domain is known:
    PF14_0186    PFC0166w    PFI0945w   


    PTHR13871:SF1 - PTHR13871:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0945w   


    PTHR13871:SF4 - PTHR13871:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0186   


    PTHR13872 - PTHR13872 (Panther link)

    Proteins where this domain is known:
    PF11_0173    PF11_0260   


    PTHR13872:SF1 - PTHR13872:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0173   


    PTHR13872:SF2 - 60S RIBOSOMAL PROTEIN L35 (Panther link)

    Proteins where this domain is known:
    PF11_0260   


    PTHR13889 - PTHR13889 (Panther link)

    Proteins where this domain is known:
    PFC0365w   


    PTHR13889:SF6 - PTHR13889:SF6 (Panther link)

    Proteins where this domain is known:
    PFC0365w   


    PTHR13890 - PTHR13890 (Panther link)

    Proteins where this domain is known:
    MAL13P1.23    PF11_0210    PF14_0255   


    PTHR13904 - PTHR13904 (Panther link)

    Proteins where this domain is known:
    PFD0450c   


    PTHR13923 - PTHR13923 (Panther link)

    Proteins where this domain is known:
    PFB0640c   


    PTHR13923:SF2 - PTHR13923:SF2 (Panther link)

    Proteins where this domain is known:
    PFB0640c   


    PTHR13930 - PTHR13930 (Panther link)

    Proteins where this domain is known:
    PFE1240w   


    PTHR13931 - UBIQUITINATION FACTOR E4 (Panther link)

    Proteins where this domain is known:
    PF08_0020    PFL1750c   


    PTHR13931:SF1 - gb def: Hypothetical protein (Panther link)

    Proteins where this domain is known:
    PF08_0020    PFL1750c   


    PTHR13937 - PTHR13937 (Panther link)

    Proteins where this domain is known:
    PFL0310c   


    PTHR13946 - PTHR13946 (Panther link)

    Proteins where this domain is known:
    PF13_0023    PF14_0150   


    PTHR13946:SF16 - PTHR13946:SF16 (Panther link)

    Proteins where this domain is known:
    PF13_0023   


    PTHR13946:SF17 - PTHR13946:SF17 (Panther link)

    Proteins where this domain is known:
    PF14_0150   


    PTHR13948 - PTHR13948 (Panther link)

    Proteins where this domain is known:
    PF13_0278   


    PTHR13950 - PTHR13950 (Panther link)

    Proteins where this domain is known:
    MAL8P1.139   


    PTHR13952 - PTHR13952 (Panther link)

    Proteins where this domain is known:
    MAL13P1.338   


    PTHR13976 - PTHR13976 (Panther link)

    Proteins where this domain is known:
    PF10_0235   


    PTHR13980 - PTHR13980 (Panther link)

    Proteins where this domain is known:
    PFE0870w   


    PTHR13980:SF2 - PTHR13980:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0870w   


    PTHR13982 - PTHR13982 (Panther link)

    Proteins where this domain is known:
    PFB0440c   


    PTHR14005 - PTHR14005 (Panther link)

    Proteins where this domain is known:
    PFL0625c   


    PTHR14009 - PTHR14009 (Panther link)

    Proteins where this domain is known:
    PFD0835c   


    PTHR14009:SF1 - PTHR14009:SF1 (Panther link)

    Proteins where this domain is known:
    PFD0835c   


    PTHR14017 - PTHR14017 (Panther link)

    Proteins where this domain is known:
    PF11_0127   


    PTHR14017:SF1 - PTHR14017:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0127   


    PTHR14042 - PTHR14042 (Panther link)

    Proteins where this domain is known:
    MAL13P1.123   


    PTHR14052 - ORC2 (Panther link)

    Interpro entry IPR007220 : Origin recognition complex subunit 2 (Interpro link)

    Interpro description:

    All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six-protein complex. This entry is subunit 2, which binds the origin of replication. It plays a role in chromosome replication and mating type transcriptional silencing.

    Proteins where this domain is known:
    MAL7P1.21   


    PTHR14068 - eIF3b (Panther link)

    Interpro entry IPR011400 : Translation initiation factor eIF-3b (Interpro link)

    Interpro description:

    This group represents a translation initiation factor eIF-3b, which binds to the 40S ribosome and promotes the binding of methionyl-tRNAi and mRNA. eIF-3 is composed of at least 12 different subunits.

    Proteins where this domain is known:
    PFE0885w   


    PTHR14085 - PTHR14085 (Panther link)

    Proteins where this domain is known:
    PF07_0092   


    PTHR14089 - PTHR14089 (Panther link)

    Proteins where this domain is known:
    PFE0750c    PFF1435w    PFL2310w   


    PTHR14089:SF2 - PTHR14089:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0750c   


    PTHR14089:SF4 - PTHR14089:SF4 (Panther link)

    Proteins where this domain is known:
    PFF1435w    PFL2310w   


    PTHR14091 - PTHR14091 (Panther link)

    Proteins where this domain is known:
    PFL1820w   


    PTHR14094 - PTHR14094 (Panther link)

    Proteins where this domain is known:
    PF11_0375   


    PTHR14094:SF9 - PTHR14094:SF9 (Panther link)

    Proteins where this domain is known:
    PF11_0375   


    PTHR14095 - PTHR14095 (Panther link)

    Proteins where this domain is known:
    PF13_0302   


    PTHR14110 - PTHR14110 (Panther link)

    Proteins where this domain is known:
    PFF1330c   


    PTHR14145 - PTHR14145 (Panther link)

    Proteins where this domain is known:
    PF11_0303   


    PTHR14145:SF1 - PTHR14145:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0303   


    PTHR14152 - SART_1 (Panther link)

    Interpro entry IPR005011 : (Interpro link)

    Interpro description:
    This family of proteins appears to contain a leucine zipper and may therefore be a family of transcription factors.

    Proteins where this domain is known:
    PFC1060c   


    PTHR14154 - PTHR14154 (Panther link)

    Proteins where this domain is known:
    PF13_0223    PF14_0671   


    PTHR14154:SF2 - PTHR14154:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0671   


    PTHR14154:SF3 - PTHR14154:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0223   


    PTHR14159 - PTHR14159 (Panther link)

    Proteins where this domain is known:
    PFL1295w   


    PTHR14190 - Vps52 (Panther link)

    Interpro entry IPR007258 : (Interpro link)

    Interpro description:
    Vps52 complexes with Vps53 and Vps54 to form a multi-subunit complex involved in regulating membrane trafficking events.

    Proteins where this domain is known:
    PF13_0135   


    PTHR14190:SF1 - PTHR14190:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0135   


    PTHR14205 - PTHR14205 (Panther link)

    Proteins where this domain is known:
    PFL1040w   


    PTHR14212 - PTHR14212 (Panther link)

    Proteins where this domain is known:
    MAL13P1.45   


    PTHR14222 - PTHR14222 (Panther link)

    Proteins where this domain is known:
    PF11_0368    PF14_0031   


    PTHR14359 - PTHR14359 (Panther link)

    Proteins where this domain is known:
    MAL8P1.81   


    PTHR14359:SF10 - PTHR14359:SF10 (Panther link)

    Proteins where this domain is known:
    MAL8P1.81   


    PTHR14413 - Ribosomal_L17 (Panther link)

    Interpro entry IPR000456 : Ribosomal protein L17 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L17 is one of the proteins from the large ribosomal subunit. Bacterial L17 is a protein of 120 to 130 amino-acid residues while yeast YmL8 is twice as large (238 residues). The N-terminal half of YmL8 is colinear with the sequence of L17 from Escherichia coli.

    Proteins where this domain is known:
    PF14_0289    PFE1125w   


    PTHR14413:SF2 - PTHR14413:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0289   


    PTHR14413:SF3 - PTHR14413:SF3 (Panther link)

    Proteins where this domain is known:
    PFE1125w   


    PTHR14467 - Arv1 (Panther link)

    Interpro entry IPR007290 : (Interpro link)

    Interpro description:

    Arv1 is a transmembrane protein, with potential zinc-binding motifs, that mediates sterol homeostasis. Its action is important in lipid homeostasis, which prevents free sterol toxicity. Arv1 contains a homology domain (AHD), which consists of an N-terminal cysteine-rich subdomain with a putative zinc-binding motif, followed by a C-terminal subdomain of 33 amino acids. The C-terminal subdomain of the AHD is critical for the protein's function. In yeast, Arv1p is important for the delivery of an early glycosylphosphatidylinositol (GPI) intermediate, GlcN-acylPI, to the first mannosyltransferase of GPI synthesis in the ER lumen. It is important for the traffic of sterol in yeast and in humans. In eukaryotic cells, it may function in the sphingolipid metabolic pathway as a transporter of ceramides between the ER and Golgi.

    Proteins where this domain is known:
    PF10_0110   


    PTHR14490 - Krr1 (Panther link)

    Interpro entry IPR018034 : (Interpro link)

    Interpro description:
    The Kri1 protein is also known as KRR1-interacting protein 1. The Saccharomyces cerevisiae member of this family is found to be required for the assembly of preribosomal 40S subunits in the nucleolus. KRR1 is highly expressed in dividing cells and its expression ceases almost completely when cells enter the stationary phase.

    Proteins where this domain is known:
    PF08_0026   


    PTHR14499 - PTHR14499 (Panther link)

    Proteins where this domain is known:
    PFL1875w   


    PTHR14604 - PTHR14604 (Panther link)

    Proteins where this domain is known:
    PFE0540w   


    PTHR14624 - DFG10 PROTEIN (Panther link)

    Proteins where this domain is known:
    PF14_0791   


    PTHR14677 - PTHR14677 (Panther link)

    Proteins where this domain is known:
    PFE0200c   


    PTHR14738 - PTHR14738 (Panther link)

    Proteins where this domain is known:
    PFF1110c   


    PTHR14738:SF1 - PTHR14738:SF1 (Panther link)

    Proteins where this domain is known:
    PFF1110c   


    PTHR14741 - PTHR14741 (Panther link)

    Proteins where this domain is known:
    PFL0125c   


    PTHR14741:SF5 - PTHR14741:SF5 (Panther link)

    Proteins where this domain is known:
    PFL0125c   


    PTHR14742 - Rpr2 (Panther link)

    Interpro entry IPR007175 : (Interpro link)

    Interpro description:
    This family contains a ribonuclease P subunit of human and yeast. Other members of the family include the probable archaeal homologues. This subunit possibly binds the precursor tRNA.

    Proteins where this domain is known:
    MAL13P1.153   


    PTHR14927 - PTHR14927 (Panther link)

    Proteins where this domain is known:
    PF08_0135   


    PTHR14978 - PTHR14978 (Panther link)

    Proteins where this domain is known:
    PF11_0397   


    PTHR14986 - Urm1 (Panther link)

    Interpro entry IPR015221 : (Interpro link)

    Interpro description:

    Ubiquitin related modifier 1 (Urm1) is a ubiquitin related protein that modifies proteins in the yeast ubiquitin-like urmylation pathway. Structural comparisons and phylogenetic analysis of the ubiquitin superfamily have indicated that Urm1 has the most conserved structural and sequence features of the common ancestor of the entire superfamily.

    Proteins where this domain is known:
    PF11_0393   


    PTHR15081 - PTHR15081 (Panther link)

    Proteins where this domain is known:
    PFL0280c   


    PTHR15081:SF1 - PTHR15081:SF1 (Panther link)

    Proteins where this domain is known:
    PFL0280c   


    PTHR15092 - PTHR15092 (Panther link)

    Proteins where this domain is known:
    PF14_0413   


    PTHR15092:SF3 - PTHR15092:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0413   


    PTHR15111 - PTHR15111 (Panther link)

    Proteins where this domain is known:
    PFI0350c   


    PTHR15184 - PTHR15184 (Panther link)

    Proteins where this domain is known:
    PF13_0065    PFB0795w    PFD0305c    PFL1725w   


    PTHR15184:SF11 - V-TYPE ATP SYNTHASE BETA CHAIN (Panther link)

    Proteins where this domain is known:
    PFD0305c   


    PTHR15184:SF3 - ATPase_F1_a (Panther link)

    Interpro entry IPR017458 : ATPase, F1 complex, alpha subunit, C-terminal (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This entry represents the C-terminal region of the alpha subunit in the F1 complex of F-ATPases. In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha-subunit contains a highly conserved adenine-specific non-catalytic nucleotide-binding domain, with a conserved amino acid sequence of Gly-X-X-X-X-Gly-Lys. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP, which are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.
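
    The conserved sequence Gly-X-X-X-X-Gly-Lys mentioned above can be searched for in a one-letter protein sequence with a simple pattern. The following is a minimal sketch only: the function name find_motif and the toy sequence are hypothetical, the pattern is a literal transcription of the motif as written here (G, any four residues, G, K), and a match does not by itself establish that a protein binds nucleotides.

```python
import re

# Gly-X-X-X-X-Gly-Lys in one-letter code: G, any four residues, then G and K.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(G.{4}GK))")


def find_motif(seq):
    """Return (1-based start, matched residues) for every occurrence of the motif."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Made-up toy sequence for illustration only; it is not a real protein.
toy = "MKTAGDRGTGKTALA"
print(find_motif(toy))
```

    Positions are reported 1-based, following the usual convention for residue numbering.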

    More information about these proteins can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PFB0795w   


    PTHR15184:SF7 - V-TYPE ATP SYNTHASE ALPHA CHAIN (Panther link)

    Proteins where this domain is known:
    PF13_0065   


    PTHR15184:SF8 - ATPase_F1_b (Panther link)

    Interpro entry IPR005722 : ATPase, F1 complex, beta subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This entry represents the beta subunit found in the F1 complex of F-ATPases. In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP, which are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PFL1725w   


    PTHR15231 - PTHR15231 (Panther link)

    Proteins where this domain is known:
    PF11_0425   


    PTHR15239 - PTHR15239 (Panther link)

    Proteins where this domain is known:
    PFL0130c   


    PTHR15241 - PTHR15241 (Panther link)

    Proteins where this domain is known:
    PF10_0028   


    PTHR15245 - PTHR15245 (Panther link)

    Proteins where this domain is known:
    PFC0475c   


    PTHR15245:SF11 - PTHR15245:SF11 (Panther link)

    Proteins where this domain is known:
    PFC0475c   


    PTHR15316 - PTHR15316 (Panther link)

    Proteins where this domain is known:
    PF14_0713   


    PTHR15323 - D123 (Panther link)

    Interpro entry IPR009772 : (Interpro link)

    Interpro description:

    This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate.

    Proteins where this domain is known:
    PFC1000w   


    PTHR15327 - MFAP1_C (Panther link)

    Interpro entry IPR009730 : Micro-fibrillar-associated 1, C-terminal (Interpro link)

    Interpro description:

    This entry represents the C terminus (approximately 300 residues) of eukaryotic micro-fibrillar-associated protein 1, which is a component of elastin-associated microfibrils in the extracellular matrix.

    Proteins where this domain is known:
    MAL13P1.132   


    PTHR15346 - PTHR15346 (Panther link)

    Proteins where this domain is known:
    MAL13P1.230   


    PTHR15350 - PTHR15350 (Panther link)

    Proteins where this domain is known:
    PFD0880w   


    PTHR15350:SF1 - PTHR15350:SF1 (Panther link)

    Proteins where this domain is known:
    PFD0880w   


    PTHR15362 - PTHR15362 (Panther link)

    Proteins where this domain is known:
    MAL13P1.82   


    PTHR15371 - PTHR15371 (Panther link)

    Proteins where this domain is known:
    PF13_0300   


    PTHR15457 - PTHR15457 (Panther link)

    Proteins where this domain is known:
    MAL8P1.91   


    PTHR15457:SF1 - PTHR15457:SF1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.91   


    PTHR15481 - PTHR15481 (Panther link)

    Proteins where this domain is known:
    PF11_0320   


    PTHR15574 - PTHR15574 (Panther link)

    Proteins where this domain is known:
    PF14_0262    PF14_0263   


    PTHR15574:SF1 - PTHR15574:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0262   


    PTHR15574:SF6 - PTHR15574:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0263   


    PTHR15588 - PTHR15588 (Panther link)

    Proteins where this domain is known:
    MAL8P1.9    PF11_0255   


    PTHR15601 - RAMP4 (Panther link)

    Interpro entry IPR010580 : (Interpro link)

    Interpro description:

    This family consists of several ribosome associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane comprised of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER.

    Proteins where this domain is known:
    PFB0888w   


    PTHR15606 - PTHR15606 (Panther link)

    Proteins where this domain is known:
    PF13_0036   


    PTHR15606:SF1 - PTHR15606:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0036   


    PTHR15608 - SPLICING FACTOR U2AF-ASSOCIATED PROTEIN 2 (Panther link)

    Proteins where this domain is known:
    MAL7P1.157a   


    PTHR15680 - Ribosomal_L19 (Panther link)

    Interpro entry IPR001857 : Ribosomal protein L19 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L19 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L19 is known to be located at the 30S-50S ribosomal subunit interface and may play a role in the structure and function of the aminoacyl-tRNA binding site. It belongs to a family of ribosomal proteins, including L19 from bacteria and the chloroplasts of red algae.

    L19 is a protein of 120 to 130 amino-acid residues.

    Proteins where this domain is known:
    PFF0495w   


    PTHR15680:SF2 - PTHR15680:SF2 (Panther link)

    Proteins where this domain is known:
    PFF0495w   


    PTHR15840 - PTHR15840 (Panther link)

    Proteins where this domain is known:
    PFE0580w   


    PTHR15893 - Ribosomal_L27 (Panther link)

    Interpro entry IPR001684 : Ribosomal protein L27 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria). The schematic relationship between these groups of proteins is shown below.

    Proteins where this domain is known:
    PF10_0332    PFC0701w   


    PTHR15913 - PTHR15913 (Panther link)

    Proteins where this domain is known:
    PF14_0395   


    PTHR15938 - TBPIP (Panther link)

    Interpro entry IPR010776 : (Interpro link)

    Interpro description:

    This family consists of several eukaryotic TBP-1 interacting protein (TBPIP) sequences. TBP-1 has been demonstrated to interact with the human immunodeficiency virus type 1 (HIV-1) viral protein Tat and thereby modulate the essential replication process of HIV. In addition, TBP-1 has been shown to be a component of the 26S proteasome, a basic multiprotein complex that degrades ubiquitinated proteins in an ATP-dependent fashion. Human TBPIP interacts with human TBP-1 and modulates the inhibitory action of human TBP-1 on HIV-Tat-mediated transactivation.

    Proteins where this domain is known:
    PFL0325w   


    PTHR15948 - GPCR89-rel (Panther link)

    Interpro entry IPR015672 : (Interpro link)

    Interpro description:

    These probable G-protein coupled receptors were identified as parts of large genome screens. Members of this group are from insects, invertebrates and vertebrates; however, the function that they may serve is unknown.

    Proteins where this domain is known:
    PF10_0082    PF14_0199   


    PTHR15952 - PTHR15952 (Panther link)

    Proteins where this domain is known:
    MAL13P1.83   


    PTHR15952:SF11 - PTHR15952:SF11 (Panther link)

    Proteins where this domain is known:
    MAL13P1.83   


    PTHR15954 - PTHR15954 (Panther link)

    Proteins where this domain is known:
    PFA0155c   


    PTHR15955 - PTHR15955 (Panther link)

    Proteins where this domain is known:
    PF13_0297   


    PTHR15959 - PTHR15959 (Panther link)

    Proteins where this domain is known:
    MAL13P1.365   


    PTHR15967 - PTHR15967 (Panther link)

    Proteins where this domain is known:
    PFB0826c   


    PTHR16012 - PTHR16012 (Panther link)

    Proteins where this domain is known:
    MAL8P1.132    PF07_0104    PF11_0478    PFA0535c    PFC0770c    PFC0860w    PFL0545w    PFL2165w    PFL2190c   


    PTHR16012:SF110 - PTHR16012:SF110 (Panther link)

    Proteins where this domain is known:
    PFA0535c    PFC0860w   


    PTHR16012:SF150 - PTHR16012:SF150 (Panther link)

    Proteins where this domain is known:
    PFC0770c   


    PTHR16012:SF35 - PTHR16012:SF35 (Panther link)

    Proteins where this domain is known:
    MAL8P1.132   


    PTHR16012:SF78 - PTHR16012:SF78 (Panther link)

    Proteins where this domain is known:
    PFL2165w   


    PTHR16017 - PTHR16017 (Panther link)

    Proteins where this domain is known:
    PF10_0326   


    PTHR16019 - PTHR16019 (Panther link)

    Proteins where this domain is known:
    PFD1095w   


    PTHR16023 - PTHR16023 (Panther link)

    Proteins where this domain is known:
    PFL1240c   


    PTHR16189 - PTHR16189 (Panther link)

    Proteins where this domain is known:
    PFL0420w   


    PTHR16193 - PTHR16193 (Panther link)

    Proteins where this domain is known:
    MAL13P1.52   


    PTHR16255 - DUF155 (Panther link)

    Interpro entry IPR003734 : (Interpro link)

    Interpro description:

    This entry describes proteins of unknown function.

    Proteins where this domain is known:
    MAL8P1.63    PF10_0181    PF10_0318   


    PTHR16305 - TESTICULAR SOLUBLE ADENYLYL CYCLASE (Panther link)

    Proteins where this domain is known:
    MAL8P1.150   


    PTHR16305:SF7 - ADENYLATE CYCLASE (Panther link)

    Proteins where this domain is known:
    MAL8P1.150   


    PTHR16719 - COX17 (Panther link)

    Interpro entry IPR007745 : Cytochrome c oxidase copper chaperone (Interpro link)

    Interpro description:
    Cox17p is essential for the assembly of functional cytochrome c oxidase (CCO) and for delivery of copper ions to the mitochondrion for insertion into the enzyme in Saccharomyces cerevisiae.

    Proteins where this domain is known:
    PF10_0252   


    PTHR16742 - PTHR16742 (Panther link)

    Proteins where this domain is known:
    PFE0065w   


    PTHR16742:SF2 - PTHR16742:SF2 (Panther link)

    Proteins where this domain is known:
    PFE0065w   


    PTHR17204 - PTHR17204 (Panther link)

    Proteins where this domain is known:
    PFE1320w   


    PTHR17204:SF5 - PTHR17204:SF5 (Panther link)

    Proteins where this domain is known:
    PFE1320w   


    PTHR17408 - PTHR17408 (Panther link)

    Proteins where this domain is known:
    PFE0955w   


    PTHR17453 - SRP19 (Panther link)

    Interpro entry IPR002778 : Signal recognition particle, SRP19 subunit (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents the SRP19 subunit. The SRP19 protein is unstructured but forms a compact core domain and two extended RNA-binding loops upon binding the signal recognition particle (SRP) RNA.

    Proteins where this domain is known:
    PFL0785c   


    PTHR17490 - PTHR17490 (Panther link)

    Proteins where this domain is known:
    PFL0175c   


    PTHR17602 - Ribosom_reg (Panther link)

    Interpro entry IPR007023 : Ribosomal biogenesis regulatory protein (Interpro link)

    Interpro description:

    This is a family of eukaryotic ribosomal biogenesis regulatory proteins.

    Proteins where this domain is known:
    PF11_0259   


    PTHR17605 - PTHR17605 (Panther link)

    Proteins where this domain is known:
    PF14_0055   


    PTHR17920 - DUF726 (Panther link)

    Interpro entry IPR007941 : (Interpro link)

    Interpro description:

    This family consists of several uncharacterised eukaryotic proteins.

    Proteins where this domain is known:
    PFL0295c   


    PTHR18034 - PTHR18034 (Panther link)

    Proteins where this domain is known:
    PFL1855w   


    PTHR18034:SF3 - PTHR18034:SF3 (Panther link)

    Proteins where this domain is known:
    PFL1855w   


    PTHR18063 - DUF544 (Panther link)

    Interpro entry IPR007518 : (Interpro link)

    Interpro description:
    This is a eukaryotic protein of unknown function.

    Proteins where this domain is known:
    MAL13P1.37    PF13_0106   


    PTHR18359 - PTHR18359 (Panther link)

    Proteins where this domain is known:
    PFL1175w   


    PTHR18847 - PTHR18847 (Panther link)

    Proteins where this domain is known:
    PFD0750w   


    PTHR18860 - 14-3-3 (Panther link)

    Interpro entry IPR000308 : 14-3-3 protein (Interpro link)

    Interpro description:

    The 14-3-3 proteins are a large family of approximately 30 kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.

    14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein, by regulating interactions between the bound protein and other molecules in the cell through sequestration or modification, or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
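
    The RSxpSxP consensus described above can be turned into a simple sequence scan. The following is a minimal sketch in Python, assuming a plain one-letter protein sequence with no phosphorylation annotation, so it can only flag candidate R-S-x-S-x-P hexamers; whether the second serine is actually phosphorylated cannot be read from sequence alone. The function name find_candidate_sites and the example sequence are hypothetical and are not part of this listing.

```python
import re

# Candidate 14-3-3 binding sites: R-S-x-S-x-P in one-letter code.
# The second S is the residue that would need to be phosphorylated;
# this pattern is an assumption-based simplification of RSxpSxP.
MOTIF = re.compile(r"RS[A-Z]S[A-Z]P")


def find_candidate_sites(sequence):
    """Return (position, hexamer) pairs for each candidate site.

    position is the 1-based index of the candidate phosphoserine,
    i.e. the fourth residue of the matched hexamer.
    """
    seq = sequence.upper()
    return [(m.start() + 4, m.group(0)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical sequence used only for illustration.
    for pos, hexamer in find_candidate_sites("MKRSASTPLLRSTSAPGG"):
        print(pos, hexamer)
```

    A match is only a starting point for inspection: as noted above, exceptions to the consensus exist, so a scan of this kind may miss genuine binding sites as well as report sites that are never phosphorylated.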

    Proteins where this domain is known:
    MAL13P1.309    MAL8P1.69   


    PTHR18866 - PTHR18866 (Panther link)

    Proteins where this domain is known:
    PF14_0664   


    PTHR18866:SF6 - PTHR18866:SF6 (Panther link)

    Proteins where this domain is known:
    PF14_0664   


    PTHR18867 - PTHR18867 (Panther link)

    Proteins where this domain is known:
    PFF0285c   


    PTHR18867:SF1 - PTHR18867:SF1 (Panther link)

    Proteins where this domain is known:
    PFF0285c   


    PTHR18895 - PTHR18895 (Panther link)

    Proteins where this domain is known:
    PF13_0016   


    PTHR18895:SF8 - PTHR18895:SF8 (Panther link)

    Proteins where this domain is known:
    PF13_0016   


    PTHR18916 - PTHR18916 (Panther link)

    Proteins where this domain is known:
    PFI0335w   


    PTHR18916:SF6 - PTHR18916:SF6 (Panther link)

    Proteins where this domain is known:
    PFI0335w   


    PTHR18919 - Thiolase (Panther link)

    Interpro entry IPR002155 : (Interpro link)

    Interpro description:

    Two different types of thiolase are found both in eukaryotes and in prokaryotes: acetoacetyl-CoA thiolase and 3-ketoacyl-CoA thiolase. 3-ketoacyl-CoA thiolase (also called thiolase I) has a broad chain-length specificity for its substrates and is involved in degradative pathways such as fatty acid beta-oxidation. Acetoacetyl-CoA thiolase (also called thiolase II) is specific for the thiolysis of acetoacetyl-CoA and involved in biosynthetic pathways such as poly beta-hydroxybutyrate synthesis or steroid biogenesis.

    In eukaryotes, there are two forms of 3-ketoacyl-CoA thiolase: one located in the mitochondrion and the other in peroxisomes.

    There are two conserved cysteine residues important for thiolase activity. The first, located in the N-terminal section of the enzymes, is involved in the formation of an acyl-enzyme intermediate; the second, located at the C-terminal extremity, is the active site base involved in deprotonation in the condensation reaction.

    Mammalian nonspecific lipid-transfer protein (nsL-TP) (also known as sterol carrier protein 2) is a protein which seems to exist in two different forms: a 14 kDa protein (SCP-2) and a larger 58 kDa protein (SCP-x). The former is found in the cytoplasm or the mitochondria and is involved in lipid transport; the latter is found in peroxisomes. The C-terminal part of SCP-x is identical to SCP-2 while the N-terminal portion is evolutionarily related to thiolases.

    Proteins where this domain is known:
    PF14_0484   


    PTHR18919:SF2 - PTHR18919:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0484   


    PTHR18929 - PROTEIN DISULFIDE ISOMERASE (Panther link)

    Proteins where this domain is known:
    MAL8P1.17    PF11_0286    PF11_0352    PF13_0272    PF14_0694    PFI0950w   


    PTHR18934 - PTHR18934 (Panther link)

    Proteins where this domain is known:
    MAL13P1.14    MAL13P1.322    PF08_0042    PF10_0294    PF14_0720    PFC0440c    PFI0860c    PFL1525c   


    PTHR18937 - PTHR18937 (Panther link)

    Proteins where this domain is known:
    MAL13P1.96    PF11_0317    PFD0685c    PFE0450w   


    PTHR18937:SF11 - PTHR18937:SF11 (Panther link)

    Proteins where this domain is known:
    PF11_0317   


    PTHR18937:SF13 - PTHR18937:SF13 (Panther link)

    Proteins where this domain is known:
    PFE0450w   


    PTHR18937:SF14 - PTHR18937:SF14 (Panther link)

    Proteins where this domain is known:
    PFD0685c   


    PTHR18937:SF9 - PTHR18937:SF9 (Panther link)

    Proteins where this domain is known:
    MAL13P1.96   


    PTHR18952 - Euk_COanhd (Panther link)

    Interpro entry IPR001148 : (Interpro link)

    Interpro description:

    Carbonic anhydrases (CA) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.

    This entry represents alpha class carbonic anhydrases.

    More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.

    Proteins where this domain is known:
    PF11_0411   


    PTHR18958 - PTHR18958 (Panther link)

    Proteins where this domain is known:
    MAL13P1.285    MAL13P1.71    MAL8P1.28    PF10_0102    PF10_0213    PF10_0328    PF11_0197    PF14_0106    PF14_0222    PFB0410c    PFE0400w    PFF1315w    PFL2200w   


    PTHR18958:SF113 - PTHR18958:SF113 (Panther link)

    Proteins where this domain is known:
    PF14_0222   


    PTHR18958:SF181 - PTHR18958:SF181 (Panther link)

    Proteins where this domain is known:
    PF10_0328   


    PTHR18958:SF198 - PTHR18958:SF198 (Panther link)

    Proteins where this domain is known:
    PF11_0197   


    PTHR18958:SF266 - PTHR18958:SF266 (Panther link)

    Proteins where this domain is known:
    PF14_0106   


    PTHR18958:SF28 - PTHR18958:SF28 (Panther link)

    Proteins where this domain is known:
    MAL13P1.285   


    PTHR18958:SF35 - PTHR18958:SF35 (Panther link)

    Proteins where this domain is known:
    PFB0410c   


    PTHR18958:SF77 - gb def: UNC-44 ankyrins (Panther link)

    Proteins where this domain is known:
    MAL13P1.71   


    PTHR19139 - MIP (Panther link)

    Interpro entry IPR000425 : Major intrinsic protein (Interpro link)

    Interpro description:

    A number of transmembrane (TM) channel proteins can be grouped together on the basis of sequence similarities.

    These include:

    MIP family proteins are thought to contain 6 TM domains. Sequence analysis suggests that the proteins may have arisen through tandem, intragenic duplication from an ancestral protein that contained 3 TM domains.

    Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins. Aquaporin-CHIP (Aquaporin 1) belongs to the Colton blood group system and is associated with Co(a/b) antigen.

    Proteins where this domain is known:
    PF11_0338   


    PTHR19139:SF8 - gb def: Aquaglyceroporin (Panther link)

    Proteins where this domain is known:
    PF11_0338   


    PTHR19211 - PTHR19211 (Panther link)

    Proteins where this domain is known:
    PF08_0078    PF11_0225   


    PTHR19211:SF12 - PTHR19211:SF12 (Panther link)

    Proteins where this domain is known:
    PF11_0225   


    PTHR19222 - PTHR19222 (Panther link)

    Proteins where this domain is known:
    PF14_0133    PF14_0321    PFC0875w   


    PTHR19222:SF22 - PTHR19222:SF22 (Panther link)

    Proteins where this domain is known:
    PF14_0321   


    PTHR19222:SF6 - SufC (Panther link)

    Interpro entry IPR010230 : ATPase SufC, SUF system FeS cluster assembly (Interpro link)

    Interpro description:

    Iron-sulphur (FeS) clusters are important cofactors for numerous proteins involved in electron transfer, in redox and non-redox catalysis, in gene regulation, and as sensors of oxygen and iron. These functions depend on the various FeS cluster prosthetic groups, the most common being [2Fe-2S] and [4Fe-4S]. FeS cluster assembly is a complex process involving the mobilisation of Fe and S atoms from storage sources, their assembly into [Fe-S] form, their transport to specific cellular locations, and their transfer to recipient apoproteins. So far, three FeS assembly machineries have been identified, which are capable of synthesising all types of [Fe-S] clusters: ISC (iron-sulphur cluster), SUF (sulphur assimilation), and NIF (nitrogen fixation) systems.

    The ISC system is conserved in eubacteria and eukaryotes (mitochondria), and has broad specificity, targeting general FeS proteins. It is encoded by the isc operon (iscRSUA-hscBA-fdx-iscX). IscS is a cysteine desulphurase, which obtains S from cysteine (converting it to alanine) and serves as a S donor for FeS cluster assembly. IscU and IscA act as scaffolds to accept S and Fe atoms, assembling clusters and transferring them to recipient apoproteins. HscA is a molecular chaperone and HscB is a co-chaperone. Fdx is a [2Fe-2S]-type ferredoxin. IscR is a transcription factor that regulates expression of the isc operon. IscX (also known as YfhJ) appears to interact with IscS and may function as an Fe donor during cluster assembly.

    The SUF system is an alternative pathway to the ISC system that operates under iron starvation and oxidative stress. It is found in eubacteria, archaea and eukaryotes (plastids). The SUF system is encoded by the suf operon (sufABCDSE), and the six encoded proteins are arranged into two complexes (SufSE and SufBCD) and one protein (SufA). SufS is a pyridoxal-phosphate (PLP) protein displaying cysteine desulphurase activity. SufE acts as a scaffold protein that accepts S from SufS and donates it to SufA. SufC is an ATPase with an unorthodox ATP-binding cassette (ABC)-like component. No specific functions have been assigned to SufB and SufD. SufA is homologous to IscA, acting as a scaffold protein in which Fe and S atoms are assembled into [FeS] cluster forms, which can then easily be transferred to apoprotein targets.

    In the NIF system, NifS and NifU are required for the formation of metalloclusters of nitrogenase in Azotobacter vinelandii, and other organisms, as well as in the maturation of other FeS proteins. Nitrogenase catalyses the fixation of nitrogen. It contains a complex cluster, the FeMo cofactor, which contains molybdenum, Fe and S. NifS is a cysteine desulphurase. NifU binds one Fe atom at its N-terminal, assembling an FeS cluster that is transferred to nitrogenase apoproteins. Nif proteins involved in the formation of FeS clusters can also be found in organisms that do not fix nitrogen.

    This entry represents SufC, which acts as an ATPase in the SUF system. SufC belongs to the ATP-binding cassette transporter family but is no longer thought to be part of a transporter. The complex is reported as cytosolic or associated with the membrane.

    Proteins where this domain is known:
    PF14_0133   


    PTHR19241 - PTHR19241 (Panther link)

    Proteins where this domain is known:
    PF14_0244   


    PTHR19241:SF16 - PTHR19241:SF16 (Panther link)

    Proteins where this domain is known:
    PF14_0244   


    PTHR19242 - PTHR19242 (Panther link)

    Proteins where this domain is known:
    PF11_0466    PF13_0218    PF13_0271    PF14_0455    PFA0245w    PFA0590w    PFC0125w    PFC0510w    PFE1150w    PFL0495c    PFL1410c   


    PTHR19242:SF52 - PTHR19242:SF52 (Panther link)

    Proteins where this domain is known:
    PF11_0466   


    PTHR19242:SF60 - PTHR19242:SF60 (Panther link)

    Proteins where this domain is known:
    PFL0495c   


    PTHR19242:SF67 - PTHR19242:SF67 (Panther link)

    Proteins where this domain is known:
    PF14_0455   


    PTHR19242:SF68 - PTHR19242:SF68 (Panther link)

    Proteins where this domain is known:
    PF13_0271   


    PTHR19242:SF72 - MULTIDRUG RESISTANCE PROTEIN-RELATED (Panther link)

    Proteins where this domain is known:
    PFE1150w   


    PTHR19242:SF98 - PTHR19242:SF98 (Panther link)

    Proteins where this domain is known:
    PFA0245w    PFC0510w   


    PTHR19248 - PTHR19248 (Panther link)

    Proteins where this domain is known:
    MAL13P1.344   


    PTHR19248:SF3 - PTHR19248:SF3 (Panther link)

    Proteins where this domain is known:
    MAL13P1.344   


    PTHR19288 - PTHR19288 (Panther link)

    Proteins where this domain is known:
    PF07_0059   


    PTHR19288:SF6 - PTHR19288:SF6 (Panther link)

    Proteins where this domain is known:
    PF07_0059   


    PTHR19302 - Spc97_Spc98 (Panther link)

    Interpro entry IPR007259 : Spc97/Spc98 (Interpro link)

    Interpro description:

    Members of this family are spindle pole body (SPB) components such as Spc97, Spc98 and gamma-tubulin. The SPB functions as the microtubule-organising centre in yeast, with the microtubule cytoskeleton playing an essential role in chromosome segregation, cellular organisation and vesicle trafficking in eukaryotic cells. In most cells, the centrosome is the primary microtubule-organising centre that nucleates and organises microtubules. Gamma-tubulin localises to centrosomes and is required for microtubule nucleation. In Saccharomyces cerevisiae, gamma-tubulin forms a stable complex with Spc97 and Spc98.

    Proteins where this domain is known:
    PF14_0414    PF14_0599    PFC0650w    PFI1620c   


    PTHR19302:SF10 - PTHR19302:SF10 (Panther link)

    Proteins where this domain is known:
    PF14_0414   


    PTHR19302:SF12 - PTHR19302:SF12 (Panther link)

    Proteins where this domain is known:
    PFI1620c   


    PTHR19302:SF7 - PTHR19302:SF7 (Panther link)

    Proteins where this domain is known:
    PFC0650w   


    PTHR19302:SF8 - PTHR19302:SF8 (Panther link)

    Proteins where this domain is known:
    PF14_0599   


    PTHR19305 - SYNAPTOSOMAL ASSOCIATED PROTEIN (Panther link)

    Proteins where this domain is known:
    MAL13P1.113   


    PTHR19306 - PTHR19306 (Panther link)

    Proteins where this domain is known:
    PF11_0249    PFE1255w   


    PTHR19306:SF1 - PTHR19306:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0249   


    PTHR19315 - DUF1077 (Panther link)

    Interpro entry IPR009445 : (Interpro link)

    Interpro description:

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PF14_0335   


    PTHR19316 - PTHR19316 (Panther link)

    Proteins where this domain is known:
    PFD0525w   


    PTHR19316:SF2 - PTHR19316:SF2 (Panther link)

    Proteins where this domain is known:
    PFD0525w   


    PTHR19326 - PTHR19326 (Panther link)

    Proteins where this domain is known:
    PFL2245w   


    PTHR19331 - PTHR19331 (Panther link)

    Proteins where this domain is known:
    PF14_0067   


    PTHR19331:SF16 - PTHR19331:SF16 (Panther link)

    Proteins where this domain is known:
    PF14_0067   


    PTHR19338 - PTHR19338 (Panther link)

    Proteins where this domain is known:
    PF14_0208   


    PTHR19355 - SERINE PROTEASE-RELATED (Panther link)

    Proteins where this domain is known:
    PFB0310c   


    PTHR19355:SF67 - COAGULATION FACTOR VIIB (Panther link)

    Proteins where this domain is known:
    PFB0310c   


    PTHR19359 - PTHR19359 (Panther link)

    Proteins where this domain is known:
    PF14_0266    PFL1555w   


    PTHR19370 - PTHR19370 (Panther link)

    Proteins where this domain is known:
    PF13_0353    PFI0885w   


    PTHR19370:SF1 - PTHR19370:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0885w   


    PTHR19370:SF3 - PTHR19370:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0353   


    PTHR19375 - Hsp70 (Panther link)

    Interpro entry IPR001023 : Heat shock protein Hsp70 (Interpro link)

    Interpro description:
    A family of heat shock proteins, the hsp70 proteins have an average molecular weight of 70 kDa. In most species, there are many proteins that belong to the hsp70 family. Some of these are only expressed under stress conditions (strictly inducible), while some are present in cells under normal growth conditions and are not heat-inducible (constitutive or cognate). Hsp70 proteins can be found in different cellular compartments (nuclear, cytosolic, mitochondrial, endoplasmic reticulum, etc.).

    Little is known of the function of hsp70 proteins. Some evidence suggests that the constitutive members have a role in the disassembly of clathrin cages, and may also participate in the post-translational transmembrane targeting of proteins to cellular organelles. No specific activities or associations have been found for the inducible members, although it has been suggested that they may accept incoming precursor proteins, keep them unfolded, then pass them on to the hsp60/hsp10 (cpn60/cpn10) complex for folding and assembly.

    Proteins where this domain is known:
    MAL13P1.540    MAL7P1.228    PF07_0033    PF08_0054    PF11_0351    PFI0875w   


    PTHR19375:SF1 - PTHR19375:SF1 (Panther link)

    Proteins where this domain is known:
    MAL7P1.228    PF08_0054    PF11_0351    PFI0875w   


    PTHR19375:SF10 - PTHR19375:SF10 (Panther link)

    Proteins where this domain is known:
    PF07_0033   


    PTHR19375:SF4 - gb def: Putative HSP protein (Panther link)

    Proteins where this domain is known:
    MAL13P1.540   


    PTHR19376 - PTHR19376 (Panther link)

    Proteins where this domain is known:
    PF13_0150    PFC0805w    PFE0465c   


    PTHR19376:SF12 - PTHR19376:SF12 (Panther link)

    Proteins where this domain is known:
    PFE0465c   


    PTHR19376:SF14 - PTHR19376:SF14 (Panther link)

    Proteins where this domain is known:
    PFC0805w   


    PTHR19376:SF15 - RNA_pol3 (Panther link)

    Interpro entry IPR015700 : DNA-directed RNA polymerase III largest subunit (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    This protein appears to be specific to the largest subunit of RNA polymerase III.

    Proteins where this domain is known:
    PF13_0150   


    PTHR19384 - PTHR19384 (Panther link)

    Proteins where this domain is known:
    PF14_0478    PFF1115w    PFI1140w   


    PTHR19384:SF15 - PTHR19384:SF15 (Panther link)

    Proteins where this domain is known:
    PFF1115w   


    PTHR19384:SF4 - PTHR19384:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0478   


    PTHR19384:SF6 - PTHR19384:SF6 (Panther link)

    Proteins where this domain is known:
    PFI1140w   


    PTHR19410 - ADH_short_C2 (Panther link)

    Interpro entry IPR002198 : Short-chain dehydrogenase/reductase SDR (Interpro link)

    Interpro description:
    The short-chain dehydrogenases/reductases family (SDR) is a very large family of enzymes, most of which are known to be NAD- or NADP-dependent oxidoreductases. As the first member of this family to be characterised was Drosophila alcohol dehydrogenase, this family used to be called 'insect-type', or 'short-chain' alcohol dehydrogenases. Most members of this family are proteins of about 250 to 300 amino acid residues. Most dehydrogenases possess at least 2 domains, the first binding the coenzyme, often NAD, and the second binding the substrate. This latter domain determines the substrate specificity and contains amino acids involved in catalysis. Little sequence similarity has been found in the coenzyme binding domain although there is a large degree of structural similarity, and it has therefore been suggested that the structure of dehydrogenases has arisen through gene fusion of a common ancestral coenzyme nucleotide sequence with various substrate specific domains.

    Proteins where this domain is known:
    PFD0465c    PFD0466c    PFD1035w    PFF0730c    PFF1265w    PFI1125c   


    PTHR19410:SF12 - Enoyl-ACP_rdct (Panther link)

    Interpro entry IPR014358 : Enoyl-[acyl-carrier-protein] reductase (NADH) (Interpro link)

    Interpro description:

    This entry contains enoyl-[acyl-carrier-protein] reductases. They are components of the type II (dissociable) fatty acid synthase system and catalyse the terminal reaction in the fatty acid elongation cycle.

    They belong to the short-chain dehydrogenases/reductases (SDR) domain superfamily and are therefore related to other members of that superfamily.

    Most SDRs contain two subdomains. The N-terminal subdomain binds the coenzyme, and the C-terminal subdomain binds the substrate, determines the substrate specificity and contains amino acids involved in catalysis. Despite low sequence similarity, all SDR structures display highly similar alpha/beta folding patterns with a central beta-sheet, typical of the Rossmann-fold.

    Crystal structures of these proteins have been extensively studied.

    (This information was partially derived from the PFAM database)

    Proteins where this domain is known:
    PFF0730c   


    PTHR19410:SF38 - PTHR19410:SF38 (Panther link)

    Proteins where this domain is known:
    PFD1035w   


    PTHR19410:SF40 - PTHR19410:SF40 (Panther link)

    Proteins where this domain is known:
    PFD0465c    PFD0466c   


    PTHR19410:SF87 - PTHR19410:SF87 (Panther link)

    Proteins where this domain is known:
    PFI1125c   


    PTHR19410:SF98 - PTHR19410:SF98 (Panther link)

    Proteins where this domain is known:
    PFF1265w   


    PTHR19411 - G10 (Panther link)

    Interpro entry IPR001748 : G10 protein (Interpro link)

    Interpro description:
    A Xenopus protein known as G10 has been found to be highly conserved in a wide range of eukaryotic species. The function of G10 is still unknown. G10 is a protein of about 17 to 18 kDa (143 to 157 residues) which is hydrophilic and whose C-terminal half is rich in cysteines and could be involved in metal-binding.

    Proteins where this domain is known:
    PFE1140c   


    PTHR19424 - HS1_bd (Panther link)

    Interpro entry IPR009643 : (Interpro link)

    Interpro description:

    Heat shock factor binding protein 1 (HSBP1) appears to be a negative regulator of the heat shock response.

    Proteins where this domain is known:
    PF11_0216   


    PTHR19431 - PTHR19431 (Panther link)

    Proteins where this domain is known:
    PFE0350c   


    PTHR19443 - Hexokinase (Panther link)

    Interpro entry IPR001312 : Hexokinase (Interpro link)

    Interpro description:

    Hexokinase is an important enzyme that catalyses the ATP-dependent conversion of aldo- and keto-hexose sugars to the hexose-6-phosphate (H6P). The enzyme can catalyse this reaction on glucose, fructose, sorbitol and glucosamine, and as such is the first step in a number of metabolic pathways. The addition of a phosphate group to the sugar acts to trap it in a cell, since the negatively charged phosphate cannot easily traverse the plasma membrane.

    The enzyme is widely distributed in eukaryotes. There are three isozymes of hexokinase in yeast (PI, PII and glucokinase): isozymes PI and PII phosphorylate both aldo- and keto-sugars; glucokinase is specific for aldo-hexoses. All three isozymes contain two domains. Structural studies of yeast hexokinase reveal a well-defined catalytic pocket that binds ATP and hexose, allowing easy transfer of the phosphate from ATP to the sugar. Vertebrates contain four hexokinase isozymes, designated I to IV, where types I to III contain a duplication of the two-domain yeast-type hexokinases. Both the N- and C-terminal halves bind hexose and H6P, though in types I and III only the C-terminal half supports catalysis, while both halves support catalysis in type II. The N-terminal half is the regulatory region. Type IV hexokinase is similar to the yeast enzyme in containing only the two domains, and is sometimes incorrectly referred to as glucokinase.

    The different vertebrate isozymes differ in their catalysis, localisation and regulation, thereby contributing to the different patterns of glucose metabolism in different tissues. Whereas types I to III can phosphorylate a variety of hexose sugars and are inhibited by glucose-6-phosphate (G6P), type IV is specific for glucose and shows no G6P inhibition. Type I enzyme may have a catabolic function, producing H6P for energy production in glycolysis; it is bound to the mitochondrial membrane, which enables the coordination of glycolysis with the TCA cycle. Types II and III enzyme may have anabolic functions, providing H6P for glycogen or lipid synthesis. Type IV enzyme is found in the liver and pancreatic beta-cells, where it is controlled by insulin (activation) and glucagon (inhibition). In pancreatic beta-cells, type IV enzyme acts as a glucose sensor to modify insulin secretion. Mutations in type IV hexokinase have been associated with diabetes mellitus.

    Proteins where this domain is known:
    PFF1155w   


    PTHR19836 - Ribosomal_S14 (Panther link)

    Interpro entry IPR001209 : Ribosomal protein S14 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    S14 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S14 is known to be required for the assembly of 30S particles and may also be responsible for determining the conformation of 16S rRNA at the A site. It belongs to a family of ribosomal proteins that includes bacterial, algal and plant chloroplast, yeast mitochondrial, cyanelle and archaeal (Methanococcus vannielii) S14s, as well as yeast mitochondrial MRP2, yeast YS29A/B and mammalian S29.

    Proteins where this domain is known:
    PF11_0386    PF14_0451   


    PTHR19836:SF1 - PTHR19836:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0386   


    PTHR19836:SF3 - PTHR19836:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0451   


    PTHR19846 - WD40 REPEAT PROTEIN (Panther link)

    Proteins where this domain is known:
    MAL13P1.385   


    PTHR19848 - PTHR19848 (Panther link)

    Proteins where this domain is known:
    PF11_0471   


    PTHR19849 - PTHR19849 (Panther link)

    Proteins where this domain is known:
    PF13_0335   


    PTHR19852 - PTHR19852 (Panther link)

    Proteins where this domain is known:
    PFL0970w   


    PTHR19852:SF2 - PTHR19852:SF2 (Panther link)

    Proteins where this domain is known:
    PFL0970w   


    PTHR19853 - PTHR19853 (Panther link)

    Proteins where this domain is known:
    PF14_0456   


    PTHR19854 - PTHR19854 (Panther link)

    Proteins where this domain is known:
    PF10_0128   


    PTHR19855 - PTHR19855 (Panther link)

    Proteins where this domain is known:
    PFF1480w   


    PTHR19855:SF11 - PTHR19855:SF11 (Panther link)

    Proteins where this domain is known:
    PFF1480w   


    PTHR19858 - PTHR19858 (Panther link)

    Proteins where this domain is known:
    PF08_0130   


    PTHR19861 - PTHR19861 (Panther link)

    Proteins where this domain is known:
    MAL13P1.142    PF14_0087    PFL2105c   


    PTHR19865 - PTHR19865 (Panther link)

    Proteins where this domain is known:
    PFL1290w   


    PTHR19868 - PTHR19868 (Panther link)

    Proteins where this domain is known:
    PF08_0019   


    PTHR19876 - PTHR19876 (Panther link)

    Proteins where this domain is known:
    PFF0330w    PFI0290c   


    PTHR19876:SF1 - PTHR19876:SF1 (Panther link)

    Proteins where this domain is known:
    PFF0330w   


    PTHR19876:SF2 - COATOMER BETA SUBUNIT (Panther link)

    Proteins where this domain is known:
    PFI0290c   


    PTHR19877 - PTHR19877 (Panther link)

    Proteins where this domain is known:
    MAL7P1.81   


    PTHR19877:SF1 - PTHR19877:SF1 (Panther link)

    Proteins where this domain is known:
    MAL7P1.81   


    PTHR19879 - PTHR19879 (Panther link)

    Proteins where this domain is known:
    PF11_0400   


    PTHR19918 - PTHR19918 (Panther link)

    Proteins where this domain is known:
    PF10_0261   


    PTHR19920 - PTHR19920 (Panther link)

    Proteins where this domain is known:
    PFL0470w   


    PTHR19923 - PTHR19923 (Panther link)

    Proteins where this domain is known:
    PFC0100c   


    PTHR19957 - SYNTAXIN (Panther link)

    Proteins where this domain is known:
    MAL13P1.169    PFB0480w    PFL2070w   


    PTHR19957:SF3 - SYNTAXIN 5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.169   


    PTHR19957:SF5 - SYNTAXIN 16 (Panther link)

    Proteins where this domain is known:
    PFL2070w   


    PTHR19959 - PTHR19959 (Panther link)

    Proteins where this domain is known:
    PF13_0231   


    PTHR19959:SF22 - PTHR19959:SF22 (Panther link)

    Proteins where this domain is known:
    PF13_0231   


    PTHR19965 - PTHR19965 (Panther link)

    Proteins where this domain is known:
    PFF0760w   


    PTHR19965:SF8 - PTHR19965:SF8 (Panther link)

    Proteins where this domain is known:
    PFF0760w   


    PTHR19970 - Ribosomal_L39 (Panther link)

    Interpro entry IPR000077 : Ribosomal protein L39e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes.

    Proteins where this domain is known:
    PFF0573c   


    PTHR20426 - PTHR20426 (Panther link)

    Proteins where this domain is known:
    PFL0985c   


    PTHR20531 - PTHR20531 (Panther link)

    Proteins where this domain is known:
    PF13_0131   


    PTHR20661 - GWT1 (Panther link)

    Interpro entry IPR009447 : GWT1 (Interpro link)

    Interpro description:

    Glycosylphosphatidylinositol (GPI) is a conserved post-translational modification that anchors cell-surface proteins to the plasma membrane in eukaryotes. GWT1 is involved in GPI anchor biosynthesis; it is required for inositol acylation in yeast.

    Proteins where this domain is known:
    PFF0740c   


    PTHR20835 - PPI_Ypi1 (Panther link)

    Interpro entry IPR011107 : (Interpro link)

    Interpro description:

    These proteins include Ypi1, a novel Saccharomyces cerevisiae type 1 protein phosphatase inhibitor and ppp1r11/hcgv, annotated as having protein phosphatase inhibitor activity.

    Proteins where this domain is known:
    PF10_0311   


    PTHR20852 - PTHR20852 (Panther link)

    Proteins where this domain is known:
    PFI1110w   


    PTHR20852:SF7 - PTHR20852:SF7 (Panther link)

    Proteins where this domain is known:
    PFI1110w   


    PTHR20855 - HlyIII_related (Panther link)

    Interpro entry IPR004254 : Hly-III related (Interpro link)

    Interpro description:
    Members of this family are integral membrane proteins. This family includes proteins that are hemolysin-III homologs.

    Proteins where this domain is known:
    PF14_0528   


    PTHR20855:SF3 - HylIII (Panther link)

    Interpro entry IPR005744 : HylII (Interpro link)

    Interpro description:

    This family includes proteins from pathogenic and non-pathogenic bacteria, Homo sapiens (Human) and Drosophila melanogaster (Fruit fly). In Bacillus cereus, a pathogen, it has been shown to function as a channel-forming cytolysin. The human protein is expressed preferentially in mature macrophages, consistent with a cytolytic role.

    Proteins where this domain is known:
    PF14_0528   


    PTHR20856 - RNA_pol_I_sub2 (Panther link)

    Interpro entry IPR015712 : DNA-directed RNA polymerase, subunit 2 (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

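    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I synthesises the precursors of most ribosomal RNAs, RNA polymerase II synthesises mRNA precursors, and RNA polymerase III synthesises the precursors of 5S ribosomal RNA, the tRNAs and a variety of other small RNAs.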

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses range from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    This protein appears to be specific to DNA-directed RNA polymerases, subunit 2.

    Proteins where this domain is known:
    PF11_0358    PFB0715w    PFL0330c   


    PTHR20856:SF5 - PTHR20856:SF5 (Panther link)

    Proteins where this domain is known:
    PF11_0358   


    PTHR20856:SF7 - PTHR20856:SF7 (Panther link)

    Proteins where this domain is known:
    PFB0715w   


    PTHR20856:SF8 - PTHR20856:SF8 (Panther link)

    Proteins where this domain is known:
    PFL0330c   


    PTHR20857 - PTHR20857 (Panther link)

    Proteins where this domain is known:
    PFF0680c    PFL1920c   


    PTHR20857:SF14 - Hyethyz_kinase (Panther link)

    Interpro entry IPR000417 : Hydroxyethylthiazole kinase (Interpro link)

    Interpro description:
    Thiamine pyrophosphate (TPP), a required cofactor for many enzymes in the cell, is synthesised de novo in Salmonella typhimurium. Five kinase activities have been implicated in TPP synthesis, which involves joining a 4-methyl-5-(beta-hydroxyethyl)thiazole (THZ) moiety and a 4-amino-5-hydroxymethyl-2-methylpyrimidine (HMP) moiety. THZ kinase activity is involved in the salvage synthesis of TH-P from the thiazole:
     2-methyl-4-amino-5-hydroxymethylpyrimidine diphosphate + 4-methyl-5-(2-phosphonooxyethyl)-thiazole = pyrophosphate + thiamin monophosphate
    Hydroxyethylthiazole kinase expression is regulated at the mRNA level by intracellular thiamin pyrophosphate.

    Proteins where this domain is known:
    PFL1920c   


    PTHR20857:SF15 - PTHR20857:SF15 (Panther link)

    Proteins where this domain is known:
    PFF0680c   


    PTHR20858 - PTHR20858 (Panther link)

    Proteins where this domain is known:
    PFE1030c   


    PTHR20858:SF2 - PTHR20858:SF2 (Panther link)

    Proteins where this domain is known:
    PFE1030c   


    PTHR20861 - PTHR20861 (Panther link)

    Proteins where this domain is known:
    PFE0150c   


    PTHR20861:SF2 - IspE (Panther link)

    Interpro entry IPR004424 : 4-diphosphocytidyl-2C-methyl-D-erythritol kinase (Interpro link)

    Interpro description:
    4-diphosphocytidyl-2C-methyl-D-erythritol kinase is a member of the family of GHMP kinases that were previously designated as conserved hypothetical protein YchB or as isopentenyl monophosphate kinase. In Solanum lycopersicum (Tomato) (Lycopersicon esculentum) and Escherichia coli the protein has been identified as 4-diphosphocytidyl-2C-methyl-D-erythritol kinase, an enzyme of the deoxyxylulose phosphate pathway of terpenoid biosynthesis.

    Proteins where this domain is known:
    PFE0150c   


    PTHR20863 - PTHR20863 (Panther link)

    Proteins where this domain is known:
    PF14_0612    PFB0385w    PFL0415w   


    PTHR20863:SF2 - PTHR20863:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0612   


    PTHR20863:SF5 - ACYL CARRIER PROTEIN (Panther link)

    Proteins where this domain is known:
    PFB0385w   


    PTHR20873 - PTHR20873 (Panther link)

    Proteins where this domain is known:
    PFA0185w   


    PTHR20882 - PTHR20882 (Panther link)

    Proteins where this domain is known:
    PF14_0389   


    PTHR20882:SF9 - PTHR20882:SF9 (Panther link)

    Proteins where this domain is known:
    PF14_0389   


    PTHR20902 - 41-2 PROTEIN ANTIGEN-RELATED (Panther link)

    Proteins where this domain is known:
    PF14_0358   


    PTHR20913 - PTHR20913 (Panther link)

    Proteins where this domain is known:
    PFC1030w   


    PTHR20913:SF7 - PTHR20913:SF7 (Panther link)

    Proteins where this domain is known:
    PFC1030w   


    PTHR20917 - DUF841_euk (Panther link)

    Interpro entry IPR008559 : (Interpro link)

    Interpro description:
    This family consists of several eukaryotic proteins with no known function.

    Proteins where this domain is known:
    PF13_0331   


    PTHR20921 - DUF778 (Panther link)

    Interpro entry IPR008496 : (Interpro link)

    Interpro description:
    This family consists of several eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PFE0240w   


    PTHR20922 - Znf_Zim17 (Panther link)

    Interpro entry IPR007853 : (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog); however, they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the Zim17-type zinc finger motif thought to bind zinc. This domain is found in a number of eukaryotic proteins and is named after a short C-terminal motif of D(N/H)L. The domain is found in proteins having a novel zinc-finger essential for protein import into mitochondria.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF14_0197   


    PTHR20922:SF13 - PTHR20922:SF13 (Panther link)

    Proteins where this domain is known:
    PF14_0197   


    PTHR20934 - DUF701_Zn_bd (Panther link)

    Interpro entry IPR007808 : (Interpro link)

    Interpro description:
    This family of uncharacterised, mostly short, proteins contains a putative zinc binding domain with four conserved cysteines.

    Proteins where this domain is known:
    PFI0715w   


    PTHR20935 - PTHR20935 (Panther link)

    Proteins where this domain is known:
    PFD0660w   


    PTHR20941 - PTHR20941 (Panther link)

    Proteins where this domain is known:
    PF08_0095   


    PTHR20941:SF1 - DHP_synth (Panther link)

    Interpro entry IPR006390 : Dihydropteroate synthase (Interpro link)

    Interpro description:

    This domain is present in sequences representing dihydropteroate synthase, the enzyme that catalyzes the second to last step in folic acid biosynthesis.

    Dihydropteroate synthase (DHPS) catalyses the condensation of 6-hydroxymethyl-7,8-dihydropteridine pyrophosphate with para-aminobenzoic acid to form 7,8-dihydropteroate. This is the second step in the three-step pathway leading from 6-hydroxymethyl-7,8-dihydropterin to 7,8-dihydrofolate. DHPS is the target of sulphonamides, which are substrate analogues that compete with para-aminobenzoic acid. Bacterial DHPS (gene sul or folP) is a protein of about 275 to 315 amino acid residues that is either chromosomally encoded or found on various antibiotic resistance plasmids. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).

    Proteins where this domain is known:
    PF08_0095   


    PTHR20971 - PTHR20971 (Panther link)

    Proteins where this domain is known:
    PF14_0411   


    PTHR20978 - SF3b10 (Panther link)

    Interpro entry IPR009846 : (Interpro link)

    Interpro description:

    This family consists of several eukaryotic splicing factor 3B subunit 5 (SF3b5) proteins. SF3b5 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoprotein complex. SF3b5 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site. Also included in this entry is RDS3 complex subunit 10, another protein involved in mRNA splicing.

    Proteins where this domain is known:
    PF13_0296   


    PTHR20981 - Ribosomal_L21e (Panther link)

    Interpro entry IPR001147 : Ribosomal protein L21e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The L21E family contains proteins from a number of eukaryotic and archaebacterial organisms, including mammalian L2, Entamoeba histolytica L21, Caenorhabditis elegans L21 (C14B9.7), Saccharomyces cerevisiae (Baker's yeast) L21E (URP1) and Haloarcula marismortui HL31.

    Proteins where this domain is known:
    PF14_0240   


    PTHR20982 - Ribosome_recyc_fac (Panther link)

    Interpro entry IPR002661 : Ribosome recycling factor (Interpro link)

    Interpro description:

    The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis.

    Proteins where this domain is known:
    PFB0390w   


    PTHR21043 - Iojap (Panther link)

    Interpro entry IPR004394 : (Interpro link)

    Interpro description:
    The gene iojap is a pattern-striping gene in maize, reflecting a chloroplast development defect in some cells. Maize has two RNA polymerases in plastids, but the plastid-encoded one, similar to bacterial RNA polymerases, is missing in iojap mutants. The role of iojap in chloroplast development, and the role of its bacterial orthologs modeled here, is unclear.

    Proteins where this domain is known:
    PF10_0198   


    PTHR21085 - Chorismate_synth (Panther link)

    Interpro entry IPR000453 : Chorismate synthase (Interpro link)

    Interpro description:
    Chorismate synthase catalyzes the last of the seven steps in the shikimate pathway which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues.

    Proteins where this domain is known:
    PFF1105c   


    PTHR21091 - METHYLTETRAHYDROFOLATE:HOMOCYSTEINE METHYLTRANSFERASE RELATED (Panther link)

    Proteins where this domain is known:
    PFF0360w   


    PTHR21091:SF2 - HemE (Panther link)

    Interpro entry IPR006361 : Uroporphyrinogen decarboxylase HemE (Interpro link)

    Interpro description:

    This entry represents uroporphyrinogen decarboxylase (HemE), which catalyzes the fifth step in the haem biosynthetic pathway, converting uroporphyrinogen III to coproporphyrinogen III by decarboxylating the four acetate side chains of the substrate. This step takes the pathway toward protoporphyrin IX, a common precursor of both haem and chlorophyll, rather than toward precorrin 2 and its products.

    This activity is essential in all organisms, and subnormal activity of URO-D leads to the most common form of porphyria in humans, porphyria cutanea tarda (PCT).

    Proteins where this domain is known:
    PFF0360w   


    PTHR21094 - PTHR21094 (Panther link)

    Proteins where this domain is known:
    PF14_0659   


    PTHR21095 - PTHR21095 (Panther link)

    Proteins where this domain is known:
    PF10_0215   


    PTHR21096 - DUF842_euk (Panther link)

    Interpro entry IPR008560 : (Interpro link)

    Interpro description:
    This family consists of a number of conserved eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    PFB0620w   


    PTHR21100 - PTHR21100 (Panther link)

    Proteins where this domain is known:
    PFI0220w   


    PTHR21100:SF1 - PTHR21100:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0220w   


    PTHR21136 - SNARE PROTEINS (Panther link)

    Proteins where this domain is known:
    MAL13P1.135    MAL13P1.16    PFC0890w    PFI0515w   


    PTHR21136:SF4 - SNARE PROTEIN SEC22 (Panther link)

    Proteins where this domain is known:
    PFC0890w   


    PTHR21136:SF5 - SNARE PROTEIN YKT6 (Panther link)

    Proteins where this domain is known:
    MAL13P1.135    PFI0515w   


    PTHR21136:SF9 - gb def: Arabidopsis thaliana At5g22360/MWD9_16, putative (Panther link)

    Proteins where this domain is known:
    MAL13P1.16   


    PTHR21139 - Triophos_ismrse (Panther link)

    Interpro entry IPR000652 : Triosephosphate isomerase (Interpro link)

    Interpro description:

    Triosephosphate isomerase (TIM) is the glycolytic enzyme that catalyses the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism. The sequence around the active site residue is perfectly conserved in all known TIMs. Deficiencies in TIM are associated with haemolytic anaemia coupled with a progressive, severe neurological disorder.

    Proteins where this domain is known:
    PF14_0378    PFC0831w   


    PTHR21141 - PTHR21141 (Panther link)

    Proteins where this domain is known:
    MAL13P1.341    PF11_0043    PF11_0313    PFC0400w   


    PTHR21141:SF2 - PTHR21141:SF2 (Panther link)

    Proteins where this domain is known:
    MAL13P1.341   


    PTHR21141:SF3 - PTHR21141:SF3 (Panther link)

    Proteins where this domain is known:
    PF11_0313   


    PTHR21141:SF5 - 60S ACIDIC RIBOSOMAL PROTEIN P2 (Panther link)

    Proteins where this domain is known:
    PFC0400w   


    PTHR21141:SF6 - PTHR21141:SF6 (Panther link)

    Proteins where this domain is known:
    PF11_0043   


    PTHR21148 - PTHR21148 (Panther link)

    Proteins where this domain is known:
    PF10_0066    PF10_0359    PFE0820c   


    PTHR21148:SF1 - PTHR21148:SF1 (Panther link)

    Proteins where this domain is known:
    PFE0820c   


    PTHR21148:SF11 - PTHR21148:SF11 (Panther link)

    Proteins where this domain is known:
    PF10_0066    PF10_0359   


    PTHR21183 - Ribosomal_L47_M (Panther link)

    Interpro entry IPR010729 : Ribosomal protein L47, mitochondrial (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure.

    Proteins where this domain is known:
    PFC0675c   


    PTHR21220 - PTHR21220 (Panther link)

    Proteins where this domain is known:
    MAL13P1.191   


    PTHR21229 - Lung7_TM_rcpt (Panther link)

    Interpro entry IPR009637 : Transmembrane receptor, eukaryota (Interpro link)

    Interpro description:

    This family represents a conserved region within eukaryotic lung seven transmembrane receptors and related proteins.

    Proteins where this domain is known:
    PFL0765w   


    PTHR21229:SF1 - PTHR21229:SF1 (Panther link)

    Proteins where this domain is known:
    PFL0765w   


    PTHR21230 - PTHR21230 (Panther link)

    Proteins where this domain is known:
    PF11_0119    PF14_0464    PFL1740w   


    PTHR21230:SF1 - MEMBRIN (Panther link)

    Proteins where this domain is known:
    PF11_0119   


    PTHR21230:SF2 - NOVEL PLANT SNARE (Panther link)

    Proteins where this domain is known:
    PF14_0464   


    PTHR21231 - PTHR21231 (Panther link)

    Proteins where this domain is known:
    PF13_0261    PFI0865w    PFL0075w   


    PTHR21231:SF1 - PTHR21231:SF1 (Panther link)

    Proteins where this domain is known:
    PFL0075w   


    PTHR21231:SF2 - PTHR21231:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0261   


    PTHR21231:SF3 - PTHR21231:SF3 (Panther link)

    Proteins where this domain is known:
    PFI0865w   


    PTHR21234 - PNP_UDP (Panther link)

    Interpro entry IPR018017 : (Interpro link)

    Interpro description:

    The following phosphorylases belong to the same family:

    It should be noted that mammalian and some bacterial PNP as well as eukaryotic MTA phosphorylase belong to a different family of phosphorylases.

    Proteins where this domain is known:
    PFE0660c   


    PTHR21234:SF7 - PURINE NUCLEOSIDE PHOSPHORYLASE (Panther link)

    Proteins where this domain is known:
    PFE0660c   


    PTHR21236 - PTHR21236 (Panther link)

    Proteins where this domain is known:
    PF14_0689   


    PTHR21236:SF2 - PTHR21236:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0689   


    PTHR21237 - GrpE (Panther link)

    Interpro entry IPR000740 : GrpE nucleotide exchange factor (Interpro link)

    Interpro description:

    Molecular chaperones are a diverse family of proteins that function to protect proteins in the intracellular milieu from irreversible aggregation during synthesis and in times of cellular stress. The bacterial molecular chaperone DnaK is an enzyme that couples cycles of ATP binding, hydrolysis, and ADP release by an N-terminal ATP-hydrolysing domain to cycles of sequestration and release of unfolded proteins by a C-terminal substrate binding domain. In prokaryotes, dimeric GrpE is the co-chaperone for DnaK, and acts as a nucleotide exchange factor, stimulating the rate of ADP release 5000-fold. DnaK is itself a weak ATPase; ATP hydrolysis by DnaK is stimulated by its interaction with another co-chaperone, DnaJ. Thus the co-chaperones DnaJ and GrpE are capable of tightly regulating the nucleotide-bound and substrate-bound state of DnaK in ways that are necessary for the normal housekeeping functions and stress-related functions of the DnaK molecular chaperone cycle.

    The X-ray crystal structure of GrpE in complex with the ATPase domain of DnaK revealed that GrpE is an asymmetric homodimer, bent in a manner that favours extensive contacts with only one DnaK ATPase monomer. GrpE does not actively compete for the atomic positions occupied by the nucleotide. GrpE and ADP mutually reduce one another's affinity for DnaK 200-fold, and ATP instantly dissociates GrpE from DnaK.

    Proteins where this domain is known:
    PF11_0258   


    PTHR21237:SF5 - PTHR21237:SF5 (Panther link)

    Proteins where this domain is known:
    PF11_0258   


    PTHR21250 - PTHR21250 (Panther link)

    Proteins where this domain is known:
    PFA0315w   


    PTHR21255 - Tctex (Panther link)

    Interpro entry IPR005334 : (Interpro link)

    Interpro description:

    Tctex-1 is a dynein light chain. Dynein translocates rhodopsin-bearing vesicles along microtubules and it has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. An efficient vectorial transport system must be required to deliver large numbers of newly synthesized rhodopsin molecules (~10^7 molecules per day per photoreceptor) to the base of the outer segment of the photoreceptor; Tctex-1 may well play a role in this process. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit the interaction between Tctex-1 and rhodopsin, which may be the molecular basis of retinitis pigmentosa.

    In the mouse, the chromosomal location and pattern of expression of Tctex-1 make it a candidate for involvement in male sterility.

    Proteins where this domain is known:
    PF11_0148    PFE1173c    PFI1350c   


    PTHR21255:SF4 - T-COMPLEX-ASSOCIATED-TESTIS-EXPRESSED 1-RELATED (Panther link)

    Proteins where this domain is known:
    PFE1173c    PFI1350c   


    PTHR21290 - PTHR21290 (Panther link)

    Proteins where this domain is known:
    PFF1210w    PFF1215w   


    PTHR21290:SF2 - PTHR21290:SF2 (Panther link)

    Proteins where this domain is known:
    PFF1210w    PFF1215w   


    PTHR21297 - PTHR21297 (Panther link)

    Proteins where this domain is known:
    PFB0245c   


    PTHR21320 - CtaG_Cox11 (Panther link)

    Interpro entry IPR007533 : Cytochrome c oxidase assembly protein CtaG/Cox11 (Interpro link)

    Interpro description:
    Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C-terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilising the copper site.

    Proteins where this domain is known:
    PF10_0267    PF14_0721   


    PTHR21321 - PTHR21321 (Panther link)

    Proteins where this domain is known:
    MAL13P1.36    PFD0515w   


    PTHR21321:SF1 - PTHR21321:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.36   


    PTHR21321:SF2 - PTHR21321:SF2 (Panther link)

    Proteins where this domain is known:
    PFD0515w   


    PTHR21329 - PTHR21329 (Panther link)

    Proteins where this domain is known:
    PFF0915w   


    PTHR21329:SF4 - PTHR21329:SF4 (Panther link)

    Proteins where this domain is known:
    PFF0915w   


    PTHR21330 - UNCHARACTERIZED (Panther link)

    Proteins where this domain is known:
    PF14_0488   


    PTHR21338 - PTHR21338 (Panther link)

    Proteins where this domain is known:
    PFF0205w   


    PTHR21340 - PTHR21340 (Panther link)

    Proteins where this domain is known:
    PFE1035c   


    PTHR21347 - CLPTM1 (Panther link)

    Interpro entry IPR008429 : (Interpro link)

    Interpro description:
    This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the Homo sapiens cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis.

    Proteins where this domain is known:
    PF11_0384   


    PTHR21349 - Ribosomal_L21p (Panther link)

    Interpro entry IPR001787 : Ribosomal protein L21 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L21 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L21 is known to bind to the 23S rRNA in the presence of L20. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    Bacterial L21 is a protein of about 100 amino-acid residues; the mature form of the spinach chloroplast L21 has 200 residues.

    Proteins where this domain is known:
    PF08_0014    PF14_0212   


    PTHR21368 - Ribosomal_L9 (Panther link)

    Interpro entry IPR000244 : Ribosomal protein L9 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L9 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L9 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins grouped on the basis of sequence similarities.

    The crystal structure of Bacillus stearothermophilus L9 shows the 149-residue protein comprises two globular domains connected by a rigid linker. Each domain contains an rRNA binding site, and the protein functions as a structural protein in the large subunit of the ribosome. The C-terminal domain consists of two loops, an alpha-helix and a three-stranded mixed parallel, anti-parallel beta-sheet packed against the central alpha-helix. The long central alpha-helix is exposed to solvent in the middle and participates in the hydrophobic cores of the two domains at both ends.

    Proteins where this domain is known:
    MAL13P1.318   


    PTHR21377 - UNCHARACTERIZED (Panther link)

    Proteins where this domain is known:
    MAL13P1.150   


    PTHR21396 - PTHR21396 (Panther link)

    Proteins where this domain is known:
    PF10_0097   


    PTHR21428 - MED7 (Panther link)

    Interpro entry IPR009244 : MED7 (Interpro link)

    Interpro description:

    This family consists of several eukaryotic proteins, which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multistep process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus.

    Proteins where this domain is known:
    PF08_0037   


    PTHR21428:SF1 - PTHR21428:SF1 (Panther link)

    Proteins where this domain is known:
    PF08_0037   


    PTHR21431 - PTHR21431 (Panther link)

    Proteins where this domain is known:
    PFE0595w   


    PTHR21445 - AP_endnuclease2 (Panther link)

    Interpro entry IPR001719 : AP endonuclease, family 2 (Interpro link)

    Interpro description:

    DNA damaging agents such as the anti-tumour drugs bleomycin and neocarzinostatin or those that generate oxygen radicals produce a variety of lesions in DNA. Amongst these is base-loss which forms apurinic/apyrimidinic (AP) sites or strand breaks with atypical 3' termini. DNA repair at the AP sites is initiated by specific endonuclease cleavage of the phosphodiester backbone. Such endonucleases are also generally capable of removing blocking groups from the 3' terminus of DNA strand breaks.

    AP endonucleases can be classified into two families based on sequence similarity. Family 2 groups the enzymes listed below.

    Escherichia coli endonuclease IV and its S. cerevisiae homologue Apn1 have been shown to be transition metalloproteins that require zinc and manganese for activity.

    Proteins where this domain is known:
    PF13_0176   


    PTHR21454 - PTHR21454 (Panther link)

    Proteins where this domain is known:
    PFL0860c   


    PTHR21454:SF2 - PTHR21454:SF2 (Panther link)

    Proteins where this domain is known:
    PFL0860c   


    PTHR21466 - FAMILY NOT NAMED (Panther link)

    Proteins where this domain is known:
    PF14_0022   


    PTHR21481 - PTHR21481 (Panther link)

    Proteins where this domain is known:
    PFF0920c   


    PTHR21490 - PTHR21490 (Panther link)

    Proteins where this domain is known:
    PF14_0634   


    PTHR21493 - PTHR21493 (Panther link)

    Proteins where this domain is known:
    PFD0930w   


    PTHR21500 - CofA_tubulin_bd (Panther link)

    Interpro entry IPR004226 : Tubulin binding cofactor A (Interpro link)

    Interpro description:

    The folding pathway of tubulins includes highly specific interactions with a series of cofactors (A, B, C, D and E) after they are released from the eukaryotic chaperonin CCT. Cofactors A and D capture and stabilise tubulin in a quasi-native conformation. Cofactor E binds to the cofactor D-tubulin complex, and interaction with cofactor C then causes the release of tubulin polypeptides in the native state. This family is the tubulin-specific chaperone A.

    Proteins where this domain is known:
    PFA0460c   


    PTHR21518 - PTHR21518 (Panther link)

    Proteins where this domain is known:
    PF13_0050   


    PTHR21531 - PTHR21531 (Panther link)

    Proteins where this domain is known:
    PFB0655c   


    PTHR21535 - PTHR21535 (Panther link)

    Proteins where this domain is known:
    PFL2065c   


    PTHR21535:SF2 - PTHR21535:SF2 (Panther link)

    Proteins where this domain is known:
    PFL2065c   


    PTHR21549 - PTHR21549 (Panther link)

    Proteins where this domain is known:
    PFC0815c   


    PTHR21568 - PTHR21568 (Panther link)

    Proteins where this domain is known:
    PF14_0023   


    PTHR21569 - Ribosomal_S9 (Panther link)

    Interpro entry IPR000754 : Ribosomal protein S9 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S9 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, algal chloroplast, cyanelle and archaeal S9 proteins, and mammalian, plant and yeast mitochondrial ribosomal S9 proteins.

    Proteins where this domain is known:
    PF08_0076    PF11_0382    PF14_0132   


    PTHR21573 - PTHR21573 (Panther link)

    Proteins where this domain is known:
    MAL8P1.105   


    PTHR21575 - PTHR21575 (Panther link)

    Proteins where this domain is known:
    PF07_0018    PF14_0312    PFI0210c   


    PTHR21575:SF1 - PTHR21575:SF1 (Panther link)

    Proteins where this domain is known:
    PF07_0018   


    PTHR21575:SF4 - PTHR21575:SF4 (Panther link)

    Proteins where this domain is known:
    PFI0210c   


    PTHR21575:SF5 - PTHR21575:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0312   


    PTHR21600 - PTHR21600 (Panther link)

    Proteins where this domain is known:
    PFB0890c    PFL1350w    PFL1380w   


    PTHR21600:SF2 - PTHR21600:SF2 (Panther link)

    Proteins where this domain is known:
    PFB0890c    PFL1350w    PFL1380w   


    PTHR21650 - PTHR21650 (Panther link)

    Proteins where this domain is known:
    PFC0720w   


    PTHR21650:SF2 - Nuf2 (Panther link)

    Interpro entry IPR005549 : Kinetochore protein Nuf2 (Interpro link)

    Interpro description:

    Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. An Arabidopsis protein that had not previously been identified as a member has been included in this family. The match is not strong, but in common with other members of this family it contains a coiled-coil C-terminal to this region.

    Proteins where this domain is known:
    PFC0720w   


    PTHR21651 - UNCHARACTERIZED (Panther link)

    Proteins where this domain is known:
    PF14_0651   


    PTHR21668 - TIF_eIF-1A (Panther link)

    Interpro entry IPR001253 : Translation initiation factor 1A (eIF-1A) (Interpro link)

    Interpro description:

    Eukaryotic translation initiation factor 1A (eIF-1A) (formerly known as eIF-4C) is a protein that seems to be required for the maximal rate of protein biosynthesis. It enhances ribosome dissociation into subunits and stabilizes the binding of the initiator Met-tRNA to 40S ribosomal subunits. The archaea possess an eIF-1A homolog.

    Proteins where this domain is known:
    PF11_0447   


    PTHR21680 - DUF1014 (Panther link)

    Interpro entry IPR010422 : (Interpro link)

    Interpro description:

    This family consists of several hypothetical eukaryotic proteins of unknown function.

    Proteins where this domain is known:
    MAL8P1.10   


    PTHR21706 - PTHR21706 (Panther link)

    Proteins where this domain is known:
    PF11_0467    PFB0510w   


    PTHR21706:SF4 - PTHR21706:SF4 (Panther link)

    Proteins where this domain is known:
    PF11_0467   


    PTHR21711 - PTHR21711 (Panther link)

    Proteins where this domain is known:
    PF14_0396   


    PTHR21713 - PTHR21713 (Panther link)

    Proteins where this domain is known:
    PFF1050w   


    PTHR21716 - UPF0118 (Panther link)

    Interpro entry IPR002549 : (Interpro link)

    Interpro description:

    This is a family of hypothetical proteins. A number of the sequence records state they are transmembrane proteins or putative permeases. It is not clear what source suggested that these proteins might be permeases and this information should be treated with caution.

    Proteins where this domain is known:
    PFF0720w   


    PTHR21726 - PTHR21726 (Panther link)

    Proteins where this domain is known:
    PFI1705w   


    PTHR21727 - PHOSPHORYLATED CTD INTERACTING FACTOR 1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.49   


    PTHR21737 - PTHR21737 (Panther link)

    Proteins where this domain is known:
    PF08_0124   


    PTHR21737:SF4 - PTHR21737:SF4 (Panther link)

    Proteins where this domain is known:
    PF08_0124   


    PTHR21738 - DUF947 (Panther link)

    Interpro entry IPR009292 : (Interpro link)

    Interpro description:

    This is a family of eukaryotic proteins with unknown function.

    Proteins where this domain is known:
    PFA0475c   


    PTHR22069 - PTHR22069 (Panther link)

    Proteins where this domain is known:
    PFL0760w   


    PTHR22455 - PTHR22455 (Panther link)

    Proteins where this domain is known:
    PF07_0055   


    PTHR22455:SF9 - PTHR22455:SF9 (Panther link)

    Proteins where this domain is known:
    PF07_0055   


    PTHR22504 - Maf1 (Panther link)

    Interpro entry IPR015257 : (Interpro link)

    Interpro description:

    Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB.

    Proteins where this domain is known:
    PFD0800c   


    PTHR22572 - PTHR22572 (Panther link)

    Proteins where this domain is known:
    MAL13P1.144    PF14_0774    PFL0675c   


    PTHR22572:SF15 - PTHR22572:SF15 (Panther link)

    Proteins where this domain is known:
    PF14_0774   


    PTHR22572:SF7 - PTHR22572:SF7 (Panther link)

    Proteins where this domain is known:
    PFL0675c   


    PTHR22572:SF8 - PTHR22572:SF8 (Panther link)

    Proteins where this domain is known:
    MAL13P1.144   


    PTHR22573 - PTHR22573 (Panther link)

    Proteins where this domain is known:
    PF10_0121    PF10_0122    PF11_0311   


    PTHR22573:SF1 - PTHR22573:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0311   


    PTHR22573:SF2 - PTHR22573:SF2 (Panther link)

    Proteins where this domain is known:
    PF10_0122   


    PTHR22573:SF8 - PTHR22573:SF8 (Panther link)

    Proteins where this domain is known:
    PF10_0121   


    PTHR22594 - aa-tRNA-synt_II (Panther link)

    Interpro entry IPR018150 : Aminoacyl-tRNA synthetase, class II (D, K and N)-like (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    This entry includes the asparagine, aspartic acid, lysine, and pyrrolysyl tRNA synthetases. Pyrrolysine is a lysine derivative with a bulky pyrroline ring.

    Proteins where this domain is known:
    PF13_0262    PF14_0166    PFA0145c    PFB0525w    PFE0475w    PFE0715w   


    PTHR22594:SF4 - tRNA-synt_lys_2 (Panther link)

    Interpro entry IPR018149 : Lysyl-tRNA synthetase, class II, C-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Lysyl-tRNA synthetase is an alpha 2 homodimer that belongs to both class I and class II. In eubacteria and eukaryota lysyl-tRNA synthetases belong to class II in the same family as aspartyl tRNA synthetase. The class Ic lysyl-tRNA synthetase family is present in archaea and some eubacteria. Moreover in some eubacteria there is a gene X, which is similar to a part of lysyl-tRNA synthetase from class II. Lysyl-tRNA synthetase is duplicated in some species, for example in Escherichia coli, which has a constitutive gene (lysS) and an induced one (lysU). Lysine is activated by being attached to the alpha-phosphate of AMP before being transferred to the cognate tRNA. The refined crystal structures give "snapshots" of the active site corresponding to key steps in the aminoacylation reaction and provide the structural framework for understanding the mechanism of lysine activation. The active site of LysU is shaped to position the substrates for the nucleophilic attack of the lysine carboxylate on the ATP alpha-phosphate. No residues are directly involved in catalysis, but a number of highly conserved amino acids and three metal ions coordinate the substrates and stabilise the pentavalent transition state. A loop close to the catalytic pocket, disordered in the lysine-bound structure, becomes ordered upon adenine binding.

    Proteins where this domain is known:
    PF13_0262    PF14_0166   


    PTHR22594:SF6 - PTHR22594:SF6 (Panther link)

    Proteins where this domain is known:
    PFA0145c    PFB0525w    PFE0475w    PFE0715w   


    PTHR22603 - Choline/ethanolamine_kinase (Panther link)

    Interpro entry IPR002573 : (Interpro link)

    Interpro description:

    Choline kinase (ATP:choline phosphotransferase) belongs to the choline/ethanolamine kinase family.

    Ethanolamine and choline are major membrane phospholipids, in the form of glycerophosphoethanolamine and glycerophosphocholine. Ethanolamine is also a component of the glycosylphosphatidylinositol (GPI) anchor, which is necessary for cell-surface protein attachment. The de novo synthesis of these phospholipids begins with the creation of phosphoethanolamine and phosphocholine by ethanolamine and choline kinases in the first step of the CDP-ethanolamine pathway. There are two putative choline/ethanolamine kinases (C/EKs) in the Trypanosoma brucei genome.

    Ethanolamine kinase has no choline kinase activity and its activity is inhibited by ADP. Inositol supplementation represses ethanolamine kinase, decreasing the incorporation of ethanolamine into the CDP-ethanolamine pathway and into phosphatidylethanolamine and phosphatidylcholine.

    Proteins where this domain is known:
    PF11_0257    PF14_0020   


    PTHR22603:SF7 - PTHR22603:SF7 (Panther link)

    Proteins where this domain is known:
    PF11_0257    PF14_0020   


    PTHR22630 - PTHR22630 (Panther link)

    Proteins where this domain is known:
    PFL1200c   


    PTHR22683 - PTHR22683 (Panther link)

    Proteins where this domain is known:
    PFC0905c   


    PTHR22683:SF27 - PTHR22683:SF27 (Panther link)

    Proteins where this domain is known:
    PFC0905c   


    PTHR22684 - bHLH_Nulp1 (Panther link)

    Interpro entry IPR006994 : (Interpro link)

    Interpro description:

    This entry appears to represent a novel family of basic helix-loop-helix (bHLH) proteins that control differentiation and development of a variety of organs.

    Human Nulp1 is a basic helix-loop-helix protein expressed broadly during early embryonic organogenesis. Overexpression of human Nulp1 in COS-7 cells inhibits the transcriptional activity of serum response factor (SRF), suggesting that Nulp1 may act as a novel bHLH transcriptional repressor in the SRF signalling pathway to mediate cellular functions.

    Proteins where this domain is known:
    PFE0335w   


    PTHR22731 - PTHR22731 (Panther link)

    Proteins where this domain is known:
    MAL7P1.28   


    PTHR22734 - PTHR22734 (Panther link)

    Proteins where this domain is known:
    PF08_0055    PFI1070c   


    PTHR22748 - ExoIII_xth (Panther link)

    Interpro entry IPR004808 : Exodeoxyribonuclease III xth (Interpro link)

    Interpro description:
    All proteins in this family for which functions are known are 5' AP endonucleases that function in base excision repair and the repair of abasic sites in DNA.

    Proteins where this domain is known:
    PF14_0285    PFC0250c   


    PTHR22749 - Riboflavin_kinase (Panther link)

    Interpro entry IPR015865 : Riboflavin kinase (Interpro link)

    Interpro description:

    Riboflavin is converted into catalytically active cofactors (FAD and FMN) by the actions of riboflavin kinase, which converts it into FMN, and FAD synthetase, which adenylates FMN to FAD. Eukaryotes usually have two separate enzymes, while most prokaryotes have a single bifunctional protein that can carry out both catalyses, although exceptions occur in both cases. While eukaryotic monofunctional riboflavin kinase is orthologous to the bifunctional prokaryotic enzyme, the monofunctional FAD synthetase differs from its prokaryotic counterpart, and is instead related to the PAPS-reductase family. The bacterial FAD synthetase that is part of the bifunctional enzyme has remote similarity to nucleotidyl transferases and, hence, it may be involved in the adenylylation reaction of FAD synthetases.

    This entry represents riboflavin kinase, which occurs as part of a bifunctional enzyme or a stand-alone enzyme.

    Proteins where this domain is known:
    MAL13P1.292   


    PTHR22760 - Alg9_trans (Panther link)

    Interpro entry IPR005599 : Alg9-like mannosyltransferase (Interpro link)

    Interpro description:

    Members of this family are mannosyltransferase enzymes. At least some members are localised in the endoplasmic reticulum and involved in GPI anchor biosynthesis. In yeast, SMP3 (YOR149C) has been implicated in plasmid stability.

    Proteins where this domain is known:
    MAL13P1.210   


    PTHR22761 - PTHR22761 (Panther link)

    Proteins where this domain is known:
    PF14_0397    PFL2090c   


    PTHR22764 - PTHR22764 (Panther link)

    Proteins where this domain is known:
    PF10_0276    PFF0355c   


    PTHR22766 - PTHR22766 (Panther link)

    Proteins where this domain is known:
    PF10_0072    PF13_0188    PFE1490c   


    PTHR22766:SF2 - PTHR22766:SF2 (Panther link)

    Proteins where this domain is known:
    PF13_0188   


    PTHR22767 - PTHR22767 (Panther link)

    Proteins where this domain is known:
    PFL2120w   


    PTHR22767:SF2 - PTHR22767:SF2 (Panther link)

    Proteins where this domain is known:
    PFL2120w   


    PTHR22780 - PTHR22780 (Panther link)

    Proteins where this domain is known:
    PF14_0529    PFF0830w    PFI0200c   


    PTHR22780:SF5 - PTHR22780:SF5 (Panther link)

    Proteins where this domain is known:
    PF14_0529   


    PTHR22780:SF6 - PTHR22780:SF6 (Panther link)

    Proteins where this domain is known:
    PFI0200c   


    PTHR22781 - DELTA ADAPTIN-RELATED (Panther link)

    Proteins where this domain is known:
    MAL8P1.123   


    PTHR22798 - PTHR22798 (Panther link)

    Proteins where this domain is known:
    PFE1470w   


    PTHR22807 - PTHR22807 (Panther link)

    Proteins where this domain is known:
    PF10_0197    PF11_0305    PFL1475w   


    PTHR22807:SF11 - PTHR22807:SF11 (Panther link)

    Proteins where this domain is known:
    PF11_0305   


    PTHR22807:SF14 - PTHR22807:SF14 (Panther link)

    Proteins where this domain is known:
    PFL1475w   


    PTHR22808 - PTHR22808 (Panther link)

    Proteins where this domain is known:
    PF07_0015    PF11_0116   


    PTHR22808:SF1 - PTHR22808:SF1 (Panther link)

    Proteins where this domain is known:
    PF07_0015   


    PTHR22808:SF2 - PTHR22808:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0116   


    PTHR22809 - PTHR22809 (Panther link)

    Proteins where this domain is known:
    PFL2305w   


    PTHR22811 - PTHR22811 (Panther link)

    Proteins where this domain is known:
    MAL13P1.171    PF13_0082    PFE1340w   


    PTHR22811:SF11 - PTHR22811:SF11 (Panther link)

    Proteins where this domain is known:
    PF13_0082   


    PTHR22811:SF2 - PTHR22811:SF2 (Panther link)

    Proteins where this domain is known:
    MAL13P1.171   


    PTHR22835 - PTHR22835 (Panther link)

    Proteins where this domain is known:
    PF14_0574   


    PTHR22835:SF42 - PTHR22835:SF42 (Panther link)

    Proteins where this domain is known:
    PF14_0574   


    PTHR22836 - PTHR22836 (Panther link)

    Proteins where this domain is known:
    PFL1975c   


    PTHR22838 - PTHR22838 (Panther link)

    Proteins where this domain is known:
    PFE0930w   


    PTHR22840 - PTHR22840 (Panther link)

    Proteins where this domain is known:
    PFF1000w   


    PTHR22840:SF1 - PTHR22840:SF1 (Panther link)

    Proteins where this domain is known:
    PFF1000w   


    PTHR22842 - PTHR22842 (Panther link)

    Proteins where this domain is known:
    MAL8P1.145   


    PTHR22844 - PTHR22844 (Panther link)

    Proteins where this domain is known:
    PF11_0222   


    PTHR22847 - PTHR22847 (Panther link)

    Proteins where this domain is known:
    MAL13P1.245    MAL13P1.264    MAL8P1.43    PF10_0285    PF11_0056    PF11_0171    PF14_0412    PFE1310c    PFL1470c   


    PTHR22847:SF47 - PTHR22847:SF47 (Panther link)

    Proteins where this domain is known:
    MAL8P1.43   


    PTHR22847:SF48 - PTHR22847:SF48 (Panther link)

    Proteins where this domain is known:
    PF14_0412   


    PTHR22848 - PTHR22848 (Panther link)

    Proteins where this domain is known:
    MAL13P1.54   


    PTHR22850 - PTHR22850 (Panther link)

    Proteins where this domain is known:
    PF08_0065    PF13_0149    PF14_0314    PFA0520c    PFF0395c   


    PTHR22850:SF15 - PTHR22850:SF15 (Panther link)

    Proteins where this domain is known:
    PFA0520c   


    PTHR22850:SF16 - PTHR22850:SF16 (Panther link)

    Proteins where this domain is known:
    PF14_0314   


    PTHR22850:SF17 - PTHR22850:SF17 (Panther link)

    Proteins where this domain is known:
    PFF0395c   


    PTHR22850:SF18 - PTHR22850:SF18 (Panther link)

    Proteins where this domain is known:
    PF13_0149   


    PTHR22850:SF5 - PTHR22850:SF5 (Panther link)

    Proteins where this domain is known:
    PF08_0065   


    PTHR22851 - PTHR22851 (Panther link)

    Proteins where this domain is known:
    PFD0455w   


    PTHR22854 - PTHR22854 (Panther link)

    Proteins where this domain is known:
    MAL13P1.319   


    PTHR22854:SF2 - PTHR22854:SF2 (Panther link)

    Proteins where this domain is known:
    MAL13P1.319   


    PTHR22870 - PTHR22870 (Panther link)

    Proteins where this domain is known:
    MAL7P1.38    PF11_0385    PF11_0448    PF13_0303    PFD0145c    PFD0900w    PFE0420c   


    PTHR22870:SF10 - PTHR22870:SF10 (Panther link)

    Proteins where this domain is known:
    PF13_0303   


    PTHR22870:SF18 - PTHR22870:SF18 (Panther link)

    Proteins where this domain is known:
    PF11_0385   


    PTHR22870:SF6 - PTHR22870:SF6 (Panther link)

    Proteins where this domain is known:
    PFD0900w   


    PTHR22870:SF8 - PTHR22870:SF8 (Panther link)

    Proteins where this domain is known:
    PFE0420c   


    PTHR22880 - FALZ-RELATED BROMODOMAIN-CONTAINING PROTEINS (Panther link)

    Proteins where this domain is known:
    PF08_0034   


    PTHR22880:SF6 - HISTONE ACETYLTRANSFERASE GCN5 (Panther link)

    Proteins where this domain is known:
    PF08_0034   


    PTHR22881 - PTHR22881 (Panther link)

    Proteins where this domain is known:
    PF11_0073    PFD0980w   


    PTHR22881:SF1 - PTHR22881:SF1 (Panther link)

    Proteins where this domain is known:
    PF11_0073    PFD0980w   


    PTHR22883 - PTHR22883 (Panther link)

    Proteins where this domain is known:
    MAL13P1.117    MAL13P1.126    MAL7P1.68    PF10_0273    PF11_0167    PF11_0217    PFB0140w    PFB0725c    PFC0160w    PFE1415w    PFF0485c    PFI1580c   


    PTHR22884 - PTHR22884 (Panther link)

    Proteins where this domain is known:
    MAL13P1.122    PF08_0012    PFD0190w    PFF1440w   


    PTHR22884:SF17 - PTHR22884:SF17 (Panther link)

    Proteins where this domain is known:
    PFF1440w   


    PTHR22884:SF25 - PTHR22884:SF25 (Panther link)

    Proteins where this domain is known:
    PF08_0012   


    PTHR22884:SF26 - PTHR22884:SF26 (Panther link)

    Proteins where this domain is known:
    PFD0190w   


    PTHR22884:SF35 - PTHR22884:SF35 (Panther link)

    Proteins where this domain is known:
    MAL13P1.122   


    PTHR22888 - COX2_C (Panther link)

    Interpro entry IPR002429 : Cytochrome c oxidase subunit II C-terminal (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex which is a component of the respiratory chain and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane. The number of polypeptides in the complex ranges from 3-4 (prokaryotes) up to 13 (mammals).

    Subunit 2 (CO II) transfers the electrons from cytochrome c to the catalytic subunit 1. It contains two adjacent transmembrane regions in its N-terminus and the major part of the protein is exposed to the periplasmic or to the mitochondrial intermembrane space, respectively. CO II provides the substrate-binding site and contains a copper centre called Cu(A), probably the primary acceptor in cytochrome c oxidase. An exception is the corresponding subunit of the cbb3-type oxidase which lacks the copper A redox-centre. Several bacterial CO II have a C-terminal extension that contains a covalently bound haem c.

    Proteins where this domain is known:
    PF13_0327    PF14_0288   


    PTHR22896 - PTHR22896 (Panther link)

    Proteins where this domain is known:
    PFF0270c   


    PTHR22897 - PTHR22897 (Panther link)

    Proteins where this domain is known:
    PFL2020c   


    PTHR22904 - PTHR22904 (Panther link)

    Proteins where this domain is known:
    MAL13P1.18    PF07_0026    PF11_0101    PF14_0324    PFB0610c    PFC0515c    PFE1370w    PFI1610c    PFL0615w    PFL2015w   


    PTHR22904:SF10 - PTHR22904:SF10 (Panther link)

    Proteins where this domain is known:
    PFI1610c   


    PTHR22904:SF14 - PTHR22904:SF14 (Panther link)

    Proteins where this domain is known:
    PF14_0324   


    PTHR22904:SF15 - PTHR22904:SF15 (Panther link)

    Proteins where this domain is known:
    PFB0610c   


    PTHR22904:SF19 - PTHR22904:SF19 (Panther link)

    Proteins where this domain is known:
    PFC0515c    PFL2015w   


    PTHR22904:SF27 - PTHR22904:SF27 (Panther link)

    Proteins where this domain is known:
    PF07_0026   


    PTHR22904:SF34 - PTHR22904:SF34 (Panther link)

    Proteins where this domain is known:
    PFE1370w   


    PTHR22908 - PTHR22908 (Panther link)

    Proteins where this domain is known:
    PF14_0326   


    PTHR22908:SF11 - PTHR22908:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0326   


    PTHR22912 - PTHR22912 (Panther link)

    Proteins where this domain is known:
    PF07_0085    PF08_0066    PF14_0192    PFI1170c    PFL1550w   


    PTHR22912:SF20 - Lipoamide_DH (Panther link)

    Interpro entry IPR006258 : Dihydrolipoamide dehydrogenase (Interpro link)

    Interpro description:

    These sequences represent dihydrolipoamide dehydrogenase, a flavoprotein that acts in a number of ways. It is the E3 component of dehydrogenase complexes for pyruvate, 2-oxoglutarate, 2-oxoisovalerate, and acetoin. It can also serve as the L protein of the glycine cleavage system. This family includes a few members known to have distinct functions (ferric leghemoglobin reductase and NADH:ferredoxin oxidoreductase) but that may be predicted by homology to act as dihydrolipoamide dehydrogenase as well. The motif GGXCXXXGCXP near the N-terminus contains a redox-active disulphide.
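
    The motif quoted above can be turned into a simple pattern search. The following is a minimal sketch, assuming that X stands for any single amino acid and that sequences are given in one-letter code; the function name and the example fragment are hypothetical and serve only as illustration.

```python
import re

# GGXCXXXGCXP, with every X (any amino acid) written as "." in the pattern.
DISULPHIDE_MOTIF = re.compile(r"GG.C...GC.P")


def find_disulphide_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched text) for each non-overlapping hit."""
    return [
        (match.start() + 1, match.group())
        for match in DISULPHIDE_MOTIF.finditer(sequence.upper())
    ]


# Made-up fragment for illustration only; it is not taken from a real protein.
example_fragment = "AAAGGTCVNVGCIPAAA"
hits = find_disulphide_motif(example_fragment)
```

    A hit only shows that the sequence pattern is present; it does not by itself establish that the protein is a dihydrolipoamide dehydrogenase.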

    Proteins where this domain is known:
    PF08_0066    PFL1550w   


    PTHR22912:SF23 - Reduct_Se (Panther link)

    Interpro entry IPR006338 : Thioredoxin and glutathione reductase selenoprotein (Interpro link)

    Interpro description:

    This homodimeric, FAD-containing member of the pyridine nucleotide disulphide oxidoreductase family contains a C-terminal motif Cys-SeCys-Gly, where SeCys is selenocysteine encoded by TGA (in some sequence reports interpreted as a stop codon). In some members of this subfamily, Cys-SeCys-Gly is replaced by Cys-Cys-Gly. The reach of the selenium atom at the C-terminal arm of the protein is proposed to allow broad substrate specificity.

    Proteins where this domain is known:
    PFI1170c   


    PTHR22912:SF27 - PTHR22912:SF27 (Panther link)

    Proteins where this domain is known:
    PF14_0192   


    PTHR22912:SF31 - PTHR22912:SF31 (Panther link)

    Proteins where this domain is known:
    PF07_0085   


    PTHR22915 - PTHR22915 (Panther link)

    Proteins where this domain is known:
    PFI0735c   


    PTHR22915:SF4 - PTHR22915:SF4 (Panther link)

    Proteins where this domain is known:
    PFI0735c   


    PTHR22932 - PTHR22932 (Panther link)

    Proteins where this domain is known:
    PF14_0510    PFC0581w    PFL2450c   


    PTHR22932:SF1 - PTHR22932:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0510    PFC0581w    PFL2450c   


    PTHR22936 - Peptidase_S54_rhomboid (Panther link)

    Interpro entry IPR002610 : Peptidase S54, rhomboid (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
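
    The relationship between residue order and clan can be illustrated with a small lookup. This is only a sketch of the three examples named above, not a classification method used by MEROPS; the function names and the residue positions in the example are hypothetical.

```python
# Linear order of the catalytic triad residues, as given for the three clans above.
TRIAD_ORDER_TO_CLAN = {
    "HDS": "PA (chymotrypsin)",
    "DHS": "SB (subtilisin)",
    "SDH": "SC (carboxypeptidase)",
}


def triad_order(positions: dict[str, int]) -> str:
    """Join the one-letter residue codes in order of their sequence position."""
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(positions: dict[str, int]) -> str:
    """Look up the clan whose triad order matches, or 'unknown' if none does."""
    return TRIAD_ORDER_TO_CLAN.get(triad_order(positions), "unknown")


# Hypothetical residue numbers, chosen only to show each ordering.
example_a = clan_for_triad({"S": 200, "H": 50, "D": 100})
example_b = clan_for_triad({"H": 60, "S": 220, "D": 30})
```

    Orderings outside the three listed here return "unknown", since the text above gives them only as examples.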

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of proteins contains serine peptidases belonging to the MEROPS peptidase family S54 (Rhomboid, clan S-). They are integral membrane proteins related to the Drosophila melanogaster (Fruit fly) rhomboid protein. Members of this family are found in archaea, bacteria and eukaryotes.

    The D. melanogaster rhomboid protease cleaves type-1 transmembrane domains using a catalytic triad composed of serine, histidine and asparagine contributed by different transmembrane domains. It cleaves the transmembrane proteins Spitz, Gurken and Keren within their transmembrane domains to release a soluble TGFalpha-like growth factor. Cleavage occurs in the Golgi, following translocation of the substrates from the endoplasmic reticulum membrane by Star, another transmembrane protein. The growth factors are then able to activate the epidermal growth factor receptor.

    Few substrates of mammalian rhomboid homologues have been determined, but rhomboid-like protein 2 (MEROPS S54.002) has been shown to cleave ephrin B3. Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite.

    In Saccharomyces cerevisiae (Baker's yeast) the Pcp1 (MDM37) protein (MEROPS S54.007) is a mitochondrial endopeptidase required for the activation of cytochrome c peroxidase and for the processing of the mitochondrial dynamin-like protein Mgm1. Mutations in Pcp1 result in cells that have fragmented mitochondria, which have very few short tubules.

    Proteins where this domain is known:
    MAL8P1.16    PF11_0150    PF14_0110    PFE0340c    PFF0900c   


    PTHR22938 - PTHR22938 (Panther link)

    Proteins where this domain is known:
    PF14_0479   


    PTHR22939 - SERINE PROTEASE FAMILY S1C HTRA-RELATED (Panther link)

    Proteins where this domain is known:
    MAL8P1.126    MAL8P1.98   


    PTHR22939:SF1 - Pept_DepP2 (Panther link)

    Interpro entry IPR015724 : (Interpro link)

    Interpro description:

    The DegP/Htr family in Prokaryota, including cyanobacteria from which chloroplasts derive, consists of three serine-type endopeptidases: DegP (also named HtrA), DegQ (also named HhoA) and DegS (also named HtrH or HhoB). Consistent with the prokaryotic origins of chloroplasts, an Arabidopsis thaliana (Mouse-ear cress) DegP2 gene encoding a novel chloroplast homologue of the prokaryotic trypsin-type Deg/Htr serine proteases has been identified.

    DegP is essential for bacterial survival at temperatures above 42 degrees and for digesting misfolded protein in the periplasm. Mature DegP from Escherichia coli has 448 residues, of which His105, Asp135, and Ser210 form the catalytic triad. The protein has an N-terminal sequence typical of a leader peptide. Structural analysis indicates that bacterial HtrA is a serine protease belonging to the family of cage-forming proteases and only unfolded polypeptides can be threaded in extended conformation into the cage to access the proteolytic sites. Disulphide bonds of partially unfolded substrates impede protein breakdown and represent a conformational constraint for entering the inner cavity. This preference for unfolded polypeptides might be the reason for the increased proteolytic activity at higher temperatures.

    The DegP/Htr family shares a modular architecture composed of an N-terminal segment believed to have regulatory functions, a conserved trypsin-like protease domain, and one or two PDZ domains, which mediate specific protein-protein interactions and bind preferentially to the C-terminal three to four residues of the target protein. DegP belongs to the trypsin clan SA. SA-type proteases have a two-domain structure with each domain forming a six-stranded barrel. The active site cleft is located at the interface of the two perpendicularly arranged barrel domains. The active site is constructed by several loops located at the C-terminal side of both barrel domains. The functional unit of DegP appears to be a trimer, which is stabilized exclusively by residues of the protease domains. The basic trimer has a funnel-like shape with the protease domains located at its top and the PDZ domains protruding to the outside. Once substrates have been bound, they have to be delivered into the interior of the funnel and the proteolytic sites. In contrast to other protease-chaperone systems, ATP does not drive binding and release of substrates.

    The degQ and degS genes of E. coli encode proteins of 455 and 355 residues that are homologues of the DegP protease. Purified DegQ protein has the properties of a serine endopeptidase, and is processed by the removal of a 27-residue N-terminal signal sequence. Deletion studies suggest that DegQ, like DegP, functions as a periplasmic protease in vivo.

    This entry represents a set of known and suspected serine proteases related to DegP2 from Arabidopsis thaliana. DegP2 is a serine protease that performs the primary cleavage of the photodamaged D1 protein in plant photosystem II.

    Proteins where this domain is known:
    MAL8P1.126   


    PTHR22942 - RECA/RAD51/RADA DNA STRAND-PAIRING FAMILY MEMBER (Panther link)

    Proteins where this domain is known:
    MAL8P1.76    PF11_0087   


    PTHR22942:SF12 - PTHR22942:SF12 (Panther link)

    Proteins where this domain is known:
    PF11_0087   


    PTHR22942:SF13 - MEIOTIC RECOMBINATION PROTEIN DMC1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.76   


    PTHR22950 - PTHR22950 (Panther link)

    Proteins where this domain is known:
    PFF1430c   


    PTHR22957 - PTHR22957 (Panther link)

    Proteins where this domain is known:
    MAL13P1.244    PF11_0151    PF13_0117    PF14_0699    PFE0330w    PFI0195c    PFI0345w    PFL1445w   


    PTHR22957:SF17 - PTHR22957:SF17 (Panther link)

    Proteins where this domain is known:
    PFI0195c   


    PTHR22957:SF26 - PTHR22957:SF26 (Panther link)

    Proteins where this domain is known:
    MAL13P1.244   


    PTHR22957:SF27 - PTHR22957:SF27 (Panther link)

    Proteins where this domain is known:
    PF13_0117   


    PTHR22957:SF28 - PTHR22957:SF28 (Panther link)

    Proteins where this domain is known:
    PF14_0699   


    PTHR22957:SF54 - PTHR22957:SF54 (Panther link)

    Proteins where this domain is known:
    PF11_0151   


    PTHR22957:SF56 - PTHR22957:SF56 (Panther link)

    Proteins where this domain is known:
    PFE0330w    PFI0345w   


    PTHR22959 - PTHR22959 (Panther link)

    Proteins where this domain is known:
    PFL0450c   


    PTHR22967 - PTHR22967 (Panther link)

    Proteins where this domain is known:
    PF13_0258    PF14_0734    PFE0045c    PFI0105c    PFL0040c    PFL2280w   


    PTHR22967:SF29 - PTHR22967:SF29 (Panther link)

    Proteins where this domain is known:
    PFL2280w   


    PTHR22967:SF32 - PTHR22967:SF32 (Panther link)

    Proteins where this domain is known:
    PF14_0734    PFI0105c    PFL0040c   


    PTHR22967:SF33 - PTHR22967:SF33 (Panther link)

    Proteins where this domain is known:
    PF13_0258   


    PTHR22971 - PTHR22971 (Panther link)

    Proteins where this domain is known:
    PF13_0085   


    PTHR22974 - PTHR22974 (Panther link)

    Proteins where this domain is known:
    PF11_0488    PFB0605w   


    PTHR22974:SF2 - PTHR22974:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0488   


    PTHR22974:SF3 - PTHR22974:SF3 (Panther link)

    Proteins where this domain is known:
    PFB0605w   


    PTHR22979 - PTHR22979 (Panther link)

    Proteins where this domain is known:
    PF07_0031    PF10_0039    PF11_0431    PF14_0168    PFC0180c    PFC0185w    PFE1285w   


    PTHR22979:SF1 - PTHR22979:SF1 (Panther link)

    Proteins where this domain is known:
    PF07_0031    PF10_0039    PF11_0431    PF14_0168    PFC0180c    PFC0185w    PFE1285w   


    PTHR22982 - PTHR22982 (Panther link)

    Proteins where this domain is known:
    MAL13P1.267    MAL7P1.73    PF07_0072    PF10_0380    PF11_0239    PF11_0242    PF11_0510    PF13_0211    PF14_0227    PF14_0392    PF14_0476    PF14_0516    PFB0815w    PFC0385c    PFC0420w    PFC0485w    PFC0945w    PFF0260w    PFF0520w    PFI0095c    PFI0110c    PFI0115c    PFI0120c    PFI0125c    PFI1415w    PFL1885c   


    PTHR22982:SF13 - PTHR22982:SF13 (Panther link)

    Proteins where this domain is known:
    MAL13P1.267    PF07_0072    PF11_0239    PF11_0242    PF13_0211    PF14_0227    PFB0815w    PFC0420w    PFF0520w   


    PTHR22982:SF14 - PTHR22982:SF14 (Panther link)

    Proteins where this domain is known:
    PF10_0380    PFL1885c   


    PTHR22982:SF26 - PTHR22982:SF26 (Panther link)

    Proteins where this domain is known:
    PF14_0392    PF14_0476    PFI0095c   


    PTHR22982:SF43 - PTHR22982:SF43 (Panther link)

    Proteins where this domain is known:
    PFI0115c    PFI0120c    PFI0125c   


    PTHR22982:SF47 - PTHR22982:SF47 (Panther link)

    Proteins where this domain is known:
    PF14_0516   


    PTHR22982:SF48 - PTHR22982:SF48 (Panther link)

    Proteins where this domain is known:
    PFF0260w   


    PTHR22982:SF49 - PTHR22982:SF49 (Panther link)

    Proteins where this domain is known:
    PFC0385c   


    PTHR22985 - PTHR22985 (Panther link)

    Proteins where this domain is known:
    MAL13P1.278    PF11_0227    PF14_0346    PFI1290w    PFI1685w    PFL2250c   


    PTHR22985:SF82 - PTHR22985:SF82 (Panther link)

    Proteins where this domain is known:
    PFL2250c   


    PTHR22985:SF84 - CAMP-DEPENDENT PROTEIN KINASE, CATALYTIC SUBUNIT (PKA C) (Panther link)

    Proteins where this domain is known:
    PFI1685w   


    PTHR22985:SF90 - CGMP-DEPENDENT PROTEIN KINASE (Panther link)

    Proteins where this domain is known:
    PF14_0346   


    PTHR22986 - MAPKK-RELATED SERINE/THREONINE PROTEIN KINASES (Panther link)

    Proteins where this domain is known:
    MAL7P1.100    PF11_0464    PFB0150c    PFB0665w    PFE1290w    PFL0080c    PFL1370w   


    PTHR22986:SF25 - NIMA-RELATED PROTEIN KINASE, NEK (Panther link)

    Proteins where this domain is known:
    MAL7P1.100    PFE1290w    PFL1370w   


    PTHR22992 - PTHR22992 (Panther link)

    Proteins where this domain is known:
    PF14_0532    PF14_0723    PFC0640w   


    PTHR22992:SF10 - PTHR22992:SF10 (Panther link)

    Proteins where this domain is known:
    PFC0640w   


    PTHR22992:SF3 - PTHR22992:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0532    PF14_0723   


    PTHR22996 - PTHR22996 (Panther link)

    Proteins where this domain is known:
    PFB0687c    PFC0740c   


    PTHR22997 - Nop17p (Panther link)

    Interpro entry IPR012981 : (Interpro link)

    Interpro description:

    This domain is involved in pre-rRNA processing. It has been shown to be required either for nucleolar retention or correct assembly of the box C/D snoRNP in Saccharomyces cerevisiae.

    Proteins where this domain is known:
    PFL1690w   


    PTHR23001 - PTHR23001 (Panther link)

    Proteins where this domain is known:
    PF10_0103    PFL0335c   


    PTHR23001:SF1 - PTHR23001:SF1 (Panther link)

    Proteins where this domain is known:
    PFL0335c   


    PTHR23003 - PTHR23003 (Panther link)

    Proteins where this domain is known:
    PF10_0068    PFD0775c   


    PTHR23003:SF3 - PTHR23003:SF3 (Panther link)

    Proteins where this domain is known:
    PF10_0068   


    PTHR23041 - PTHR23041 (Panther link)

    Proteins where this domain is known:
    PFD0765w   


    PTHR23041:SF6 - PTHR23041:SF6 (Panther link)

    Proteins where this domain is known:
    PFD0765w   


    PTHR23050 - PTHR23050 (Panther link)

    Proteins where this domain is known:
    MAL7P1.69    PF10_0145    PF14_0181    PF14_0323    PFA0345w    PFD0692c    PFF0265c    PFF1320c   


    PTHR23050:SF17 - PTHR23050:SF17 (Panther link)

    Proteins where this domain is known:
    PFA0345w   


    PTHR23050:SF20 - CALMODULIN (Panther link)

    Proteins where this domain is known:
    PF14_0323   


    PTHR23056 - PTHR23056 (Panther link)

    Proteins where this domain is known:
    PF14_0492   


    PTHR23056:SF4 - PTHR23056:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0492   


    PTHR23061 - PTHR23061 (Panther link)

    Proteins where this domain is known:
    PF14_0602   


    PTHR23063 - PTHR23063 (Panther link)

    Proteins where this domain is known:
    PFI0695c   


    PTHR23063:SF1 - PTHR23063:SF1 (Panther link)

    Proteins where this domain is known:
    PFI0695c   


    PTHR23070 - PTHR23070 (Panther link)

    Proteins where this domain is known:
    PFF0155w   


    PTHR23070:SF2 - PTHR23070:SF2 (Panther link)

    Proteins where this domain is known:
    PFF0155w   


    PTHR23071 - PTHR23071 (Panther link)

    Proteins where this domain is known:
    PFL0685w   


    PTHR23073 - PTHR23073 (Panther link)

    Proteins where this domain is known:
    PF10_0081    PF11_0314    PF13_0033    PF13_0063    PFD0665c    PFL2345c   


    PTHR23073:SF10 - PTHR23073:SF10 (Panther link)

    Proteins where this domain is known:
    PF13_0033   


    PTHR23073:SF12 - PTHR23073:SF12 (Panther link)

    Proteins where this domain is known:
    PFL2345c   


    PTHR23073:SF13 - 26S PROTEASE REGULATORY SUBUNIT 7 (Panther link)

    Proteins where this domain is known:
    PF13_0063   


    PTHR23073:SF7 - PTHR23073:SF7 (Panther link)

    Proteins where this domain is known:
    PF11_0314   


    PTHR23073:SF8 - PTHR23073:SF8 (Panther link)

    Proteins where this domain is known:
    PFD0665c   


    PTHR23073:SF9 - PTHR23073:SF9 (Panther link)

    Proteins where this domain is known:
    PF10_0081   


    PTHR23074 - PTHR23074 (Panther link)

    Proteins where this domain is known:
    PF14_0548    PFD0385c   


    PTHR23074:SF3 - PTHR23074:SF3 (Panther link)

    Proteins where this domain is known:
    PF14_0548   


    PTHR23075 - PUTATIVE ATP-ASE (Panther link)

    Proteins where this domain is known:
    MAL7P1.209   


    PTHR23076 - PTHR23076 (Panther link)

    Proteins where this domain is known:
    MAL8P1.144    PF11_0203    PF14_0616    PFL1925w   


    PTHR23076:SF10 - PTHR23076:SF10 (Panther link)

    Proteins where this domain is known:
    PF14_0616    PFL1925w   


    PTHR23076:SF11 - PTHR23076:SF11 (Panther link)

    Proteins where this domain is known:
    PF11_0203   


    PTHR23077 - AAA-FAMILY ATPASE (Panther link)

    Proteins where this domain is known:
    MAL8P1.92    PF07_0047    PF08_0117    PF11_0405    PF14_0126    PFF0940c   


    PTHR23077:SF1 - PTHR23077:SF1 (Panther link)

    Proteins where this domain is known:
    MAL8P1.92   


    PTHR23077:SF10 - PTHR23077:SF10 (Panther link)

    Proteins where this domain is known:
    PF11_0405   


    PTHR23077:SF11 - PTHR23077:SF11 (Panther link)

    Proteins where this domain is known:
    PF14_0126   


    PTHR23077:SF18 - CELL DIVISION CONTROL PROTEIN 48 AAA FAMILY PROTEIN (TRANSITIONAL ENDOPLASMIC RETICULUM ATPASE) (Panther link)

    Proteins where this domain is known:
    PF07_0047    PFF0940c   


    PTHR23078 - PTHR23078 (Panther link)

    Proteins where this domain is known:
    PFC0140c   


    PTHR23081 - PTHR23081 (Panther link)

    Proteins where this domain is known:
    PF10_0124   


    PTHR23083 - PTHR23083 (Panther link)

    Proteins where this domain is known:
    PFF0080c    PFF1505w   


    PTHR23083:SF20 - PTHR23083:SF20 (Panther link)

    Proteins where this domain is known:
    PFF0080c   


    PTHR23084 - PTHR23084 (Panther link)

    Proteins where this domain is known:
    PF10_0101    PF10_0306    PF11_0307    PF14_0121    PF14_0586    PFB0230c    PFE0560c    PFE0735w   


    PTHR23084:SF11 - PTHR23084:SF11 (Panther link)

    Proteins where this domain is known:
    PFE0735w   


    PTHR23084:SF27 - PTHR23084:SF27 (Panther link)

    Proteins where this domain is known:
    PF10_0306   


    PTHR23084:SF28 - PTHR23084:SF28 (Panther link)

    Proteins where this domain is known:
    PFB0230c   


    PTHR23084:SF30 - PTHR23084:SF30 (Panther link)

    Proteins where this domain is known:
    PF14_0586   


    PTHR23084:SF8 - PTHR23084:SF8 (Panther link)

    Proteins where this domain is known:
    PF10_0101   


    PTHR23086 - PIP5K (Panther link)

    Interpro entry IPR002498 : Phosphatidylinositol-4-phosphate 5-kinase, core (Interpro link)

    Interpro description:
    This entry represents a conserved region from the common kinase core found in the type I phosphatidylinositol-4-phosphate 5-kinase (PIP5K) family. This region is found in type I, II and III phosphatidylinositol-4-phosphate 5-kinases (PIP5K enzymes). PIP5K catalyses the formation of phosphatidylinositol-4,5-bisphosphate via the phosphorylation of phosphatidylinositol-4-phosphate, a precursor in the phosphoinositide signalling pathway.

    Proteins where this domain is known:
    PFA0515w   


    PTHR23086:SF4 - PTHR23086:SF4 (Panther link)

    Proteins where this domain is known:
    PFA0515w   


    PTHR23089 - HIT (Panther link)

    Interpro entry IPR001310 : (Interpro link)

    Interpro description:

    The Histidine Triad (HIT) motif, His-phi-His-phi-His-phi-phi (phi, a hydrophobic amino acid) was identified as being highly conserved in a variety of organisms. Crystal structure of rabbit Hint, purified as an adenosine and AMP-binding protein, showed that proteins in the HIT superfamily are conserved as nucleotide-binding proteins and that Hint homologues, which are found in all forms of life, are structurally related to Fhit homologues and GalT-related enzymes, which have more restricted phylogenetic profiles. Hint homologues including rabbit Hint and yeast Hnt1 hydrolyse adenosine 5' monophosphoramide substrates such as AMP-NH2 and AMP-lysine to AMP plus the amine product and function as positive regulators of Cdk7/Kin28 in vivo. Fhit homologues are diadenosine polyphosphate hydrolases and function as tumour suppressors in human and mouse though the tumour suppressing function of Fhit does not depend on ApppA hydrolysis. The third branch of the HIT superfamily, which includes GalT homologues, contains a related His-X-His-X-Gln motif and transfers nucleoside monophosphate moieties to phosphorylated second substrates rather than hydrolysing them.

    Proteins where this domain is known:
    PF08_0059    PF14_0349   
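
    The two sequence motifs named in the HIT description above (His-phi-His-phi-His-phi-phi and the GalT-branch His-X-His-X-Gln) can be written as simple patterns. The sketch below is illustrative only: the residue set used for "phi" (hydrophobic) is an assumption, since the description does not say which amino acids count as hydrophobic, and the names HYDROPHOBIC, HIT_MOTIF, GALT_MOTIF and find_motifs are hypothetical. A pattern match alone does not establish HIT superfamily membership.

```python
import re

# Assumption: "phi" is taken to be any of these residues (one-letter codes).
# The source description does not define the hydrophobic set.
HYDROPHOBIC = "AVILMFWYC"

# His-phi-His-phi-His-phi-phi, as given in the description.
HIT_MOTIF = re.compile(rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}")

# His-X-His-X-Gln (X = any residue), the related GalT-branch motif.
GALT_MOTIF = re.compile(r"H.H.Q")


def find_motifs(sequence, pattern):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]
```

    For example, find_motifs(seq, HIT_MOTIF) could be run over a protein sequence in one-letter code; the sequences for the proteins listed above are not part of this listing and would have to be obtained separately.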


    PTHR23090 - PTHR23090 (Panther link)

    Proteins where this domain is known:
    PFI1310w   


    PTHR23090:SF1 - PTHR23090:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1310w   


    PTHR23091 - N-TERMINAL ACETYLTRANSFERASE (Panther link)

    Proteins where this domain is known:
    MAL8P1.200    PF10_0036    PFA0465c   


    PTHR23091:SF3 - N-ACETYLTRANSFERASE MAK3 (Panther link)

    Proteins where this domain is known:
    MAL8P1.200   


    PTHR23091:SF4 - PTHR23091:SF4 (Panther link)

    Proteins where this domain is known:
    PF10_0036   


    PTHR23091:SF5 - PTHR23091:SF5 (Panther link)

    Proteins where this domain is known:
    PFA0465c   


    PTHR23092 - PTHR23092 (Panther link)

    Proteins where this domain is known:
    MAL13P1.170    PFL1325c   


    PTHR23092:SF5 - PTHR23092:SF5 (Panther link)

    Proteins where this domain is known:
    MAL13P1.170   


    PTHR23105 - PTHR23105 (Panther link)

    Proteins where this domain is known:
    PF07_0046    PF11_0250    PF13_0356    PF14_0231    PF14_0391    PFD0960c    PFL0500w   


    PTHR23105:SF1 - Ribosomal_L7A (Panther link)

    Interpro entry IPR001921 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The genomic structure and sequence of the human ribosomal protein L7a has been determined. The gene contains 8 exons and 7 introns, encompassing 3179 bp. The human gene resembles other mammalian ribosomal protein genes in so far as it contains a short first exon, a short 5' untranslated leader and its transcriptional start sites at C residues embedded in a poly-pyrimidine tract.

    The sequence of a gene for ribosomal protein L4 of Saccharomyces cerevisiae (Baker's yeast) has also been determined, which, unlike most of its other ribosomal protein genes, has no intron. The single open reading frame is highly similar to mammalian ribosomal protein L7a.

    There appear to be two genes for L4, both of which are active. Yeast cells containing a disruption of the L4-1 gene form smaller colonies than either wild-type or disrupted L4-2 strains. Disruption of both L4 genes is lethal, probably resulting from an inability of the organism to produce functional ribosomes.

    Several other ribosomal proteins have been found to share sequence similarity with L7a, including yeast NHP2, Bacillus subtilis hypothetical protein ylxQ, Haloarcula marismortui (Halobacterium marismortui) Hs6, and Methanocaldococcus jannaschii MJ1203.

    This InterPro entry focuses on regions that characterise the ribosomal L7A proteins but distinguish them from the rest of the HMG-like family.

    Proteins where this domain is known:
    PF14_0231   


    PTHR23105:SF11 - PTHR23105:SF11 (Panther link)

    Proteins where this domain is known:
    PF11_0250   


    PTHR23105:SF12 - PTHR23105:SF12 (Panther link)

    Proteins where this domain is known:
    PFD0960c   


    PTHR23105:SF3 - PTHR23105:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0356   


    PTHR23105:SF4 - PTHR23105:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0391   


    PTHR23105:SF5 - Ribosom_L1_bac (Panther link)

    Proteins where this domain is known:
    PF07_0046    PFL0500w   


    PTHR23109 - PTHR23109 (Panther link)

    Proteins where this domain is known:
    PF11_0439   


    PTHR23114 - PTHR23114 (Panther link)

    Proteins where this domain is known:
    PF13_0048   


    PTHR23115 - PTHR23115 (Panther link)

    Proteins where this domain is known:
    MAL13P1.164    MAL13P1.243    PF07_0062    PF08_0018    PF10_0041    PF11_0245    PF13_0069    PF13_0304    PF13_0305    PF14_0104    PF14_0486    PFA0495c    PFE0830c    PFF0115c    PFF0345w    PFI0570w    PFI1505c    PFL1590c    PFL1710c   


    PTHR23115:SF13 - PTHR23115:SF13 (Panther link)

    Proteins where this domain is known:
    PFF0115c    PFL1590c    PFL1710c   


    PTHR23115:SF14 - PTHR23115:SF14 (Panther link)

    Proteins where this domain is known:
    PF07_0062   


    PTHR23115:SF15 - PTHR23115:SF15 (Panther link)

    Proteins where this domain is known:
    MAL13P1.243   


    PTHR23115:SF30 - PTHR23115:SF30 (Panther link)

    Proteins where this domain is known:
    PFA0495c   


    PTHR23115:SF31 - Transl_elong_EFTu/EF1A_bac/org (Panther link)

    Interpro entry IPR004541 : Translation elongation factor EFTu/EF1A, bacterial and organelle (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    EF1A (also known as EF-1alpha or EF-Tu) is a G-protein. It forms a ternary complex of EF1A-GTP-aminoacyl-tRNA. The binding of aminoacyl-tRNA stimulates GTP hydrolysis by EF1A, causing a conformational change in EF1A that causes EF1A-GDP to detach from the ribosome, leaving the aminoacyl-tRNA attached at the A-site. Only the cognate aminoacyl-tRNA can induce the required conformational change in EF1A through its tight anticodon-codon binding. EF1A-GDP is returned to its active state, EF1A-GTP, through the action of another elongation factor, EF1B (also known as EF-Ts or EF-1beta/gamma/delta).

    This entry represents EF1A (or EF-Tu) proteins found primarily in bacteria, mitochondria and chloroplasts. Eukaryotic and archaeal EF1A are excluded from this entry. When bound to GTP, EF-Tu can form a complex with any (correctly) aminoacylated tRNA except those for initiation and for selenocysteine, in which case EF-Tu is replaced by other factors.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    MAL13P1.164   


    PTHR23115:SF36 - PTHR23115:SF36 (Panther link)

    Proteins where this domain is known:
    PF11_0245   


    PTHR23115:SF37 - PTHR23115:SF37 (Panther link)

    Proteins where this domain is known:
    PF13_0304    PF13_0305   


    PTHR23115:SF38 - PTHR23115:SF38 (Panther link)

    Proteins where this domain is known:
    PFI1505c   


    PTHR23115:SF4 - PTHR23115:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0486   


    PTHR23115:SF40 - PTHR23115:SF40 (Panther link)

    Proteins where this domain is known:
    PFI0570w   


    PTHR23115:SF41 - aIF-2 (Panther link)

    Interpro entry IPR015760 : (Interpro link)

    Interpro description:

    Initiation factor 2 (IF-2) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-2 promotes the GTP-dependent binding of the initiator tRNA to the small subunit of the ribosome. IF-2 is a protein of about 70 to 95 kDa that contains a central GTP-binding domain flanked by a highly variable N-terminal domain and a more conserved C-terminal domain. Some members of this group undergo protein self splicing that involves a post-translational excision of the intein followed by peptide ligation.

    The function of IF-2 in facilitating the proper binding of initiator methionyl-tRNA to the ribosomal P site appears to be universally conserved, with an IF-2 homologue (aIF-2) present in the archaeon Methanopyrus kandleri.

    Proteins where this domain is known:
    PF08_0018    PF13_0069    PFE0830c    PFF0345w   


    PTHR23115:SF5 - PTHR23115:SF5 (Panther link)

    Proteins where this domain is known:
    PF10_0041   


    PTHR23115:SF9 - PTHR23115:SF9 (Panther link)

    Proteins where this domain is known:
    PF14_0104   


    PTHR23117 - PTHR23117 (Panther link)

    Proteins where this domain is known:
    PFI1420w   


    PTHR23117:SF1 - PTHR23117:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1420w   


    PTHR23125 - F-BOX/LEUCINE RICH REPEAT PROTEIN (Panther link)

    Proteins where this domain is known:
    PF11_0243   


    PTHR23125:SF24 - gb def: F-box domain, putative (Panther link)

    Proteins where this domain is known:
    PF11_0243   


    PTHR23127 - PTHR23127 (Panther link)

    Proteins where this domain is known:
    PF14_0174   


    PTHR23134 - PTHR23134 (Panther link)

    Proteins where this domain is known:
    MAL7P1.130   


    PTHR23137 - SFT2 (Panther link)

    Interpro entry IPR011691 : (Interpro link)

    Interpro description:
    This is a group of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments. The members of this entry also show four putative transmembrane regions.

    Proteins where this domain is known:
    PF13_0124   


    PTHR23138 - RAN BINDING PROTEIN (Panther link)

    Proteins where this domain is known:
    PFD0950w   


    PTHR23138:SF9 - RAN BINDING PROTEIN 1 (Panther link)

    Proteins where this domain is known:
    PFD0950w   


    PTHR23139 - PTHR23139 (Panther link)

    Proteins where this domain is known:
    PF14_0057    PF14_0656    PFC0865w   


    PTHR23139:SF8 - PTHR23139:SF8 (Panther link)

    Proteins where this domain is known:
    PFC0865w   


    PTHR23139:SF9 - PTHR23139:SF9 (Panther link)

    Proteins where this domain is known:
    PF14_0057    PF14_0656   


    PTHR23140 - PTHR23140 (Panther link)

    Proteins where this domain is known:
    PF14_0028   


    PTHR23142 - PRP38 (Panther link)

    Interpro entry IPR005037 : (Interpro link)

    Interpro description:

    Members of this family are related to the pre-mRNA splicing factor PRP38 from yeast, so all members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation.

    Proteins where this domain is known:
    PF11_0336    PF14_0070   


    PTHR23143 - PTHR23143 (Panther link)

    Proteins where this domain is known:
    MAL8P1.152    PFD0560w   


    PTHR23143:SF7 - PTHR23143:SF7 (Panther link)

    Proteins where this domain is known:
    MAL8P1.152    PFD0560w   


    PTHR23147 - PTHR23147 (Panther link)

    Proteins where this domain is known:
    PF11_0279    PFE0160c   


    PTHR23148 - PTHR23148 (Panther link)

    Proteins where this domain is known:
    PFC0465c   


    PTHR23151 - PTHR23151 (Panther link)

    Proteins where this domain is known:
    PF10_0407    PF13_0121    PFC0170c   


    PTHR23151:SF10 - PTHR23151:SF10 (Panther link)

    Proteins where this domain is known:
    PFC0170c   


    PTHR23151:SF7 - PTHR23151:SF7 (Panther link)

    Proteins where this domain is known:
    PF10_0407   


    PTHR23151:SF8 - PTHR23151:SF8 (Panther link)

    Proteins where this domain is known:
    PF13_0121   


    PTHR23152 - 2oxoglutarate_DH_E1 (Panther link)

    Interpro entry IPR011603 : 2-oxoglutarate dehydrogenase, E1 component (Interpro link)

    Interpro description:

    2-oxoglutarate dehydrogenase is a key enzyme in the TCA cycle, converting 2-oxoglutarate, coenzyme A and NAD(+) to succinyl-CoA, NADH and carbon dioxide. The activity of this enzyme is tightly regulated and it is a major determinant of the metabolic flux through the TCA cycle. This enzyme is composed of multiple copies of three different subunits: 2-oxoglutarate dehydrogenase (E1), dihydrolipoamide succinyltransferase (E2) and lipoamide dehydrogenase (E3), which is often shared with similar enzymes such as pyruvate dehydrogenase. The E2 component forms a large multimeric core which binds the peripheral E1 and E3 subunits. The substrate is transferred between the active sites of the different subunits by a lipoyl moiety, bound to a lysine residue from the E2 polypeptide.

    This entry represents the E1 subunit of 2-oxoglutarate dehydrogenase. It catalyses the decarboxylation of 2-oxoglutarate in a thiamine pyrophosphate-dependent manner, transferring the resultant succinyl group onto the lipoyl moiety bound to the E2 subunit. The E1 ortholog from Corynebacterium glutamicum (Brevibacterium flavum) is unusual in having an N-terminal extension that resembles the E2 component of the 2-oxoglutarate dehydrogenase enzyme.

    Proteins where this domain is known:
    PF08_0045   


    PTHR23172 - PTHR23172 (Panther link)

    Proteins where this domain is known:
    PF14_0111   


    PTHR23172:SF19 - PTHR23172:SF19 (Panther link)

    Proteins where this domain is known:
    PF14_0111   


    PTHR23180 - PTHR23180 (Panther link)

    Proteins where this domain is known:
    PF08_0120    PFE1305c    PFL2140c   


    PTHR23180:SF16 - PTHR23180:SF16 (Panther link)

    Proteins where this domain is known:
    PF08_0120    PFL2140c   


    PTHR23180:SF22 - GCN4-COMPLEMENTING PROTEIN (Panther link)

    Proteins where this domain is known:
    PFE1305c   


    PTHR23195 - YEATS (Panther link)

    Interpro entry IPR005033 : YEATS (Interpro link)

    Interpro description:

    Named the YEATS family, after 'YNK7', 'ENL', 'AF-9', and 'TFIIF small subunit', this family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity.

    Proteins where this domain is known:
    MAL8P1.131   


    PTHR23205 - PTHR23205 (Panther link)

    Proteins where this domain is known:
    PFF0970w   


    PTHR23213 - PTHR23213 (Panther link)

    Proteins where this domain is known:
    PFE1545c    PFL0925w   


    PTHR23213:SF21 - PTHR23213:SF21 (Panther link)

    Proteins where this domain is known:
    PFE1545c   


    PTHR23214 - PTHR23214 (Panther link)

    Proteins where this domain is known:
    PFL1835w   


    PTHR23215 - PTHR23215 (Panther link)

    Proteins where this domain is known:
    PF10_0091   


    PTHR23222 - Prohibitin (Panther link)

    Interpro entry IPR000163 : Prohibitin (Interpro link)

    Interpro description:

    Genes that negatively regulate proliferation inside the cell are of considerable interest because of the implications in processes such as development and cancer. Prohibitin, a novel cytoplasmic anti-proliferative protein widely expressed in a variety of tissues, inhibits DNA synthesis. Studies have suggested that prohibitin may be a suppressor gene and is associated with tumour development and/or progression of at least some breast cancers. Sequence comparisons suggest that the prohibitin gene is an analogue of Cc, a Drosophila melanogaster gene that is vital for normal development.

    Proteins where this domain is known:
    PF08_0006    PF10_0144   


    PTHR23224 - ZINC FINGER PROTEINS (Panther link)

    Proteins where this domain is known:
    MAL7P1.125    PFL0465c   


    PTHR23224:SF461 - PTHR23224:SF461 (Panther link)

    Proteins where this domain is known:
    MAL7P1.125   


    PTHR23224:SF6 - ZINC FINGER PROTEIN-RELATED (Panther link)

    Proteins where this domain is known:
    PFL0465c   


    PTHR23230 - Kelch_related (Panther link)

    Interpro entry IPR013089 : (Interpro link)

    Interpro description:

    Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.

    The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.

    This entry represents Kelch-related domains, including the BTB (broad-complex, tramtrack and bric a brac) domain, which defines a family of proteins involved in diverse biological processes. BTB proteins are divided into subgroups depending on what domain lies at the C-terminus. BTB-Kelch proteins have Kelch repeats that form a beta-propeller that can interact with actin filaments.

    Proteins where this domain is known:
    MAL7P1.137    PF13_0238   


    PTHR23230:SF184 - PTHR23230:SF184 (Panther link)

    Proteins where this domain is known:
    MAL7P1.137    PF13_0238   


    PTHR23237 - Gar1_RNA_bd (Panther link)

    Interpro entry IPR007504 : Gar1 protein RNA-binding region (Interpro link)

    Interpro description:
    Gar1 is a small nucleolar RNP protein that is required for pre-rRNA processing and pseudouridylation. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs, snR10 and snR30. This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex. In humans, Gar1 is also a component of telomerase in vivo.

    Proteins where this domain is known:
    PF13_0051   


    PTHR23240 - PTHR23240 (Panther link)

    Proteins where this domain is known:
    PF14_0711   


    PTHR23240:SF1 - PTHR23240:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0711   


    PTHR23244 - PTHR23244 (Panther link)

    Proteins where this domain is known:
    PF10_0219    PF11_0267    PF11_0268    PFL0270c    PFL0650c   


    PTHR23244:SF33 - PTHR23244:SF33 (Panther link)

    Proteins where this domain is known:
    PFL0270c   


    PTHR23244:SF38 - PTHR23244:SF38 (Panther link)

    Proteins where this domain is known:
    PF10_0219   


    PTHR23245 - PTHR23245 (Panther link)

    Proteins where this domain is known:
    PFI0700c   


    PTHR23245:SF25 - PTHR23245:SF25 (Panther link)

    Proteins where this domain is known:
    PFI0700c   


    PTHR23248 - Scramblase (Panther link)

    Interpro entry IPR005552 : (Interpro link)

    Interpro description:
    Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury.

    Proteins where this domain is known:
    PF10_0220   


    PTHR23249 - Sybindin (Panther link)

    Interpro entry IPR007233 : Sybindin-like protein (Interpro link)

    Interpro description:
    Synbindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses. Syndecan-2 induces spine formation by recruiting intracellular vesicles toward postsynaptic sites through its interaction with synbindin.

    Proteins where this domain is known:
    PF14_0049    PFC0445w   


    PTHR23249:SF2 - PTHR23249:SF2 (Panther link)

    Proteins where this domain is known:
    PF14_0049    PFC0445w   


    PTHR23252 - PTHR23252 (Panther link)

    Proteins where this domain is known:
    PFL0655w   


    PTHR23252:SF2 - PTHR23252:SF2 (Panther link)

    Proteins where this domain is known:
    PFL0655w   


    PTHR23253 - EUKARYOTIC TRANSLATION INITIATION FACTOR 4 GAMMA (Panther link)

    Proteins where this domain is known:
    MAL13P1.63   


    PTHR23253:SF7 - EUKARYOTIC TRANSLATION INITIATION FACTOR 4G (Panther link)

    Proteins where this domain is known:
    MAL13P1.63   


    PTHR23256 - PTHR23256 (Panther link)

    Proteins where this domain is known:
    MAL7P1.92    PFL0410w   


    PTHR23256:SF279 - PTHR23256:SF279 (Panther link)

    Proteins where this domain is known:
    MAL7P1.92    PFL0410w   


    PTHR23257 - PTHR23257 (Panther link)

    Proteins where this domain is known:
    PF11_0079    PFB0520w    PFF1145c    PFI1275w   


    PTHR23258 - PTHR23258 (Panther link)

    Proteins where this domain is known:
    PF10_0002    PF11_0220    PFL0010c    PFL2630w   


    PTHR23264 - PTHR23264 (Panther link)

    Proteins where this domain is known:
    PF11_0296    PFI0525w   


    PTHR23264:SF2 - PTHR23264:SF2 (Panther link)

    Proteins where this domain is known:
    PF11_0296   


    PTHR23264:SF5 - PTHR23264:SF5 (Panther link)

    Proteins where this domain is known:
    PFI0525w   


    PTHR23270 - PTHR23270 (Panther link)

    Proteins where this domain is known:
    PF14_0042   


    PTHR23273 - PTHR23273 (Panther link)

    Proteins where this domain is known:
    PFD0470c    PFI0235w   


    PTHR23289 - PTHR23289 (Panther link)

    Proteins where this domain is known:
    PF14_0331   


    PTHR23291 - UPF0005 (Panther link)

    Interpro entry IPR006214 : (Interpro link)

    Interpro description:

    This family of proteins of unknown function contains a subset of Bax inhibitor-1 proteins.

    Proteins where this domain is known:
    PF14_0571    PFL2325c   


    PTHR23291:SF11 - PTHR23291:SF11 (Panther link)

    Proteins where this domain is known:
    PFL2325c   


    PTHR23291:SF4 - PTHR23291:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0571   


    PTHR23293 - PTHR23293 (Panther link)

    Proteins where this domain is known:
    PF10_0147   


    PTHR23305 - PTHR23305 (Panther link)

    Proteins where this domain is known:
    MAL7P1.122   


    PTHR23305:SF3 - PTHR23305:SF3 (Panther link)

    Proteins where this domain is known:
    MAL7P1.122   


    PTHR23308 - PTHR23308 (Panther link)

    Proteins where this domain is known:
    PF13_0042   


    PTHR23308:SF1 - PTHR23308:SF1 (Panther link)

    Proteins where this domain is known:
    PF13_0042   


    PTHR23310 - ACBP (Panther link)

    Interpro entry IPR000582 : Acyl-CoA-binding protein, ACBP (Interpro link)

    Interpro description:

    Acyl-CoA-binding protein (ACBP) is a small (10 kDa) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.

    ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.

    Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal enoyl-CoA Delta(3), Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.

    The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.

    Other proteins containing an ACB domain include:

    Proteins where this domain is known:
    PF08_0099    PF10_0015    PF10_0016    PF14_0749   


    PTHR23310:SF5 - PTHR23310:SF5 (Panther link)

    Proteins where this domain is known:
    PF08_0099    PF10_0015    PF10_0016   


    PTHR23313 - PTHR23313 (Panther link)

    Proteins where this domain is known:
    PF14_0593   


    PTHR23314 - PTHR23314 (Panther link)

    Proteins where this domain is known:
    PF11_0318   


    PTHR23316 - IMPORTIN ALPHA (Panther link)

    Proteins where this domain is known:
    PF08_0087    PF14_0540   


    PTHR23320 - PTHR23320 (Panther link)

    Proteins where this domain is known:
    MAL13P1.261    PF10_0319   


    PTHR23320:SF1 - PTHR23320:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.261    PF10_0319   


    PTHR23321 - PTHR23321 (Panther link)

    Proteins where this domain is known:
    PF11_0072    PF13_0059   


    PTHR23321:SF3 - PTHR23321:SF3 (Panther link)

    Proteins where this domain is known:
    PF13_0059   


    PTHR23321:SF7 - PTHR23321:SF7 (Panther link)

    Proteins where this domain is known:
    PF11_0072   


    PTHR23322 - PTHR23322 (Panther link)

    Proteins where this domain is known:
    PF11_0253    PFI1680w   


    PTHR23322:SF1 - PTHR23322:SF1 (Panther link)

    Proteins where this domain is known:
    PFI1680w   


    PTHR23322:SF6 - PTHR23322:SF6 (Panther link)

    Proteins where this domain is known:
    PF11_0253   


    PTHR23323 - PTHR23323 (Panther link)

    Proteins where this domain is known:
    PF11_0262    PF13_0053    PFE0100w   


    PTHR23323:SF24 - PTHR23323:SF24 (Panther link)

    Proteins where this domain is known:
    PF11_0262   


    PTHR23323:SF25 - PTHR23323:SF25 (Panther link)

    Proteins where this domain is known:
    PF13_0053   


    PTHR23324 - PTHR23324 (Panther link)

    Proteins where this domain is known:
    PFF1450w   


    PTHR23326 - PTHR23326 (Panther link)

    Proteins where this domain is known:
    PF10_0062    PF11_0297   


    PTHR23327 - PTHR23327 (Panther link)

    Proteins where this domain is known:
    PF11_0244   


    PTHR23329 - PTHR23329 (Panther link)

    Proteins where this domain is known:
    PFE1570c   


    PTHR23329:SF1 - PTHR23329:SF1 (Panther link)

    Proteins where this domain is known:
    PFE1570c   


    PTHR23333 - PTHR23333 (Panther link)

    Proteins where this domain is known:
    MAL8P1.122   


    PTHR23333:SF3 - PTHR23333:SF3 (Panther link)

    Proteins where this domain is known:
    MAL8P1.122   


    PTHR23338 - PTHR23338 (Panther link)

    Proteins where this domain is known:
    PF11_0266    PF11_0524    PFI0475w   


    PTHR23338:SF16 - PTHR23338:SF16 (Panther link)

    Proteins where this domain is known:
    PF11_0524   


    PTHR23338:SF17 - PTHR23338:SF17 (Panther link)

    Proteins where this domain is known:
    PFI0475w   


    PTHR23338:SF18 - PTHR23338:SF18 (Panther link)

    Proteins where this domain is known:
    PF11_0266   


    PTHR23339 - TYROSINE SPECIFIC PROTEIN PHOSPHATASE AND DUAL SPECIFICITY PROTEIN PHOSPHATASE (Panther link)

    Proteins where this domain is known:
    PF11_0139   


    PTHR23339:SF23 - PROTEIN TYROSINE PHOSPHATASE PRL (Panther link)

    Proteins where this domain is known:
    PF11_0139   


    PTHR23344 - GDPD (Panther link)

    Interpro entry IPR004129 : Glycerophosphoryl diester phosphodiesterase (Interpro link)

    Interpro description:
    Glycerophosphoryl diester phosphodiesterases display broad specificity for glycerophosphodiesters: glycerophosphocholine, glycerophosphoethanolamine, glycerophosphoglycerol, and bis(glycerophosphoglycerol) are all hydrolysed by this enzyme.

    Proteins where this domain is known:
    PF14_0060   


    PTHR23346 - TRANSLATIONAL ACTIVATOR GCN1-RELATED (Panther link)

    Proteins where this domain is known:
    MAL13P1.26   


    PTHR23346:SF4 - TRANSLATIONAL ACTIVATOR GCN1-RELATED (Panther link)

    Proteins where this domain is known:
    MAL13P1.26   


    PTHR23354 - NUCLEOLAR PROTEIN 7/ESTROGEN RECEPTOR COACTIVATOR-RELATED (Panther link)

    Proteins where this domain is known:
    MAL13P1.395   


    PTHR23355 - PTHR23355 (Panther link)

    Proteins where this domain is known:
    MAL13P1.289    PFF0745c    PFI0295c   


    PTHR23355:SF12 - PTHR23355:SF12 (Panther link)

    Proteins where this domain is known:
    MAL13P1.289   


    PTHR23359 - Adenylate_kin (Panther link)

    Interpro entry IPR000850 : Adenylate kinase (Interpro link)

    Interpro description:
    Adenylate kinases (ADK) are phosphotransferases that catalyse the reversible reaction
     AMP + MgATP = ADP + MgADP 
    an essential reaction for many processes in living cells. Two ADK isozymes have been identified in mammalian cells. These specifically bind AMP and favour binding to ATP over other nucleotide triphosphates (AK1 is cytosolic and AK2 is located in the mitochondria). A third ADK has been identified in bovine heart and human cells; this is a mitochondrial GTP:AMP phosphotransferase, also specific for the phosphorylation of AMP, but it can only use GTP or ITP as a substrate. ADK has also been identified in different bacterial species and in yeast. Two further enzymes are known to be related to the ADK family, i.e. yeast uridine monophosphokinase and slime mold UMP-CMP kinase. Within the ADK family there are several conserved regions, including the ATP-binding domains. One of the most conserved areas includes an Arg residue, whose modification inactivates the enzyme, together with an Asp that resides in the catalytic cleft of the enzyme and participates in a salt bridge.

    Proteins where this domain is known:
    PF08_0062    PF10_0086    PFA0555c    PFD0755c   


    PTHR23359:SF22 - ADENYLATE KINASE (Panther link)

    Proteins where this domain is known:
    PF08_0062    PF10_0086    PFD0755c   


    PTHR23359:SF23 - PTHR23359:SF23 (Panther link)

    Proteins where this domain is known:
    PFA0555c   


    PTHR23365 - PTHR23365 (Panther link)

    Proteins where this domain is known:
    PF14_0433    PFF0300w    PFI1175c    PFI1600w   


    PTHR23389 - PTHR23389 (Panther link)

    Proteins where this domain is known:
    PFA0545c    PFB0895c   


    PTHR23405 - PTHR23405 (Panther link)

    Proteins where this domain is known:
    PFB0175c   


    PTHR23405:SF4 - Mak16 (Panther link)

    Interpro entry IPR006958 : (Interpro link)

    Interpro description:
    The function of these proteins is unknown. The yeast orthologues have been implicated in cell cycle progression and biogenesis of 60S ribosomal subunits. The Schistosoma mansoni (Blood fluke) Mak16 has been shown to target protein transport to the nucleolus.

    Proteins where this domain is known:
    PFB0175c   


    PTHR23407 - PTHR23407 (Panther link)

    Proteins where this domain is known:
    PFL2160c   


    PTHR23409 - Ribonucl_redctse (Panther link)

    Interpro entry IPR000358 : Ribonucleotide reductase (Interpro link)

    Interpro description:

    Ribonucleotide reductase catalyzes the reductive synthesis of deoxyribonucleotides from their corresponding ribonucleotides:

     2'-deoxyribonucleoside diphosphate + oxidized thioredoxin + H2O = ribonucleoside diphosphate + reduced thioredoxin 
    It provides the precursors necessary for DNA synthesis. RNRs divide into three classes on the basis of their metallocofactor usage. Class I RNRs, found in eukaryotes, bacteria, bacteriophage and viruses, use a diiron-tyrosyl radical. Class II RNRs, found in bacteria, bacteriophage, algae and archaea, use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs, found in anaerobic bacteria and bacteriophage, use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. Many organisms have more than one class of RNR present in their genomes.

    Ribonucleotide reductase is an oligomeric enzyme composed of a large subunit (700 to 1000 residues) and a small subunit (300 to 400 residues) - class II RNRs are less complex, using the small molecule B12 in place of the small chain. The small chain binds two iron atoms (three Glu, one Asp, and two His are involved in metal binding) and contains an active site tyrosine radical. The regions of the sequence that contain the metal-binding residues and the active site tyrosine are conserved in ribonucleotide reductase small chain from prokaryotes, eukaryotes and viruses. We have selected one of these regions as a signature pattern. It contains the active site residue as well as a glutamate and a histidine involved in the binding of iron.

    Proteins where this domain is known:
    PF10_0154    PF14_0053   


    PTHR23409:SF5 - RIBONUCLEOSIDE-DIPHOSPHATE REDUCTASE SMALL CHAIN (Panther link)

    Proteins where this domain is known:
    PF10_0154   


    PTHR23410 - Ribosomal_L5euk (Panther link)

    Interpro entry IPR005485 : Ribosomal protein L5, eukaryotic (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family consists of ribosomal protein L5 from eukaryotes. The ribosomal 5S RNA is the only known rRNA species to bind a ribosomal protein before its assembly into the ribosomal subunits. In eukaryotes, the 5S rRNA molecule binds one protein species, a 34-kDa protein which has been implicated in the intracellular transport of 5S rRNA.

    Proteins where this domain is known:
    PF14_0230   


    PTHR23410:SF1 - PTHR23410:SF1 (Panther link)

    Proteins where this domain is known:
    PF14_0230   


    PTHR23413 - PTHR23413 (Panther link)

    Proteins where this domain is known:
    PF07_0027    PFI0190w   


    PTHR23413:SF1 - Ribosomal_L32E (Panther link)

    Interpro entry IPR001515 : Ribosomal protein L32e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The L32e family consists of proteins that have 135 to 240 amino-acid residues.

    Proteins where this domain is known:
    PFI0190w   


    PTHR23413:SF2 - RNA_pol_N/8_sub (Panther link)

    Interpro entry IPR000268 : RNA polymerases, N/8 Kd subunits (Interpro link)

    Interpro description:
    In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. Archaebacterial subunit N (gene rpoN) is a small protein of about 8 kDa; it is evolutionarily related to an 8.3 kDa component shared by all three forms of eukaryotic RNA polymerases (gene RPB10 in yeast and POLR2J in mammals) as well as to African swine fever virus (ASFV) protein CP80R. There is a conserved region which is located at the N-terminal extremity of these polymerase subunits; this region contains two cysteines that bind a zinc ion.

    Proteins where this domain is known:
    PF07_0027   


    PTHR23415 - PTHR23415 (Panther link)

    Proteins where this domain is known:
    PF14_0635   


    PTHR23415:SF4 - PTHR23415:SF4 (Panther link)

    Proteins where this domain is known:
    PF14_0635   


    PTHR23417 - PTHR23417 (Panther link)

    Proteins where this domain is known:
    PF11_0284   


    PTHR23417:SF1 - Methyltransf_4 (Panther link)

    Interpro entry IPR003358 : tRNA (guanine-N-7) methyltransferase (Interpro link)

    Interpro description:

    This entry represents tRNA (guanine-N-7) methyltransferase, which catalyses the formation of N(7)-methylguanine at position 46 (m7G46) in tRNA. Capping of the pre-mRNA 5' end by addition of a monomethylated guanosine cap (m(7)G) is an essential modification and the earliest one in the biogenesis of mRNA. The reaction is catalysed by three enzymes: triphosphatase, guanylyltransferase, and tRNA (guanine-N-7) methyltransferase.

    Proteins where this domain is known:
    PF11_0284   


    PTHR23419 - Ion_tolerance_CutA1 (Panther link)

    Interpro entry IPR004323 : Divalent ion tolerance protein, CutA1 (Interpro link)

    Interpro description:

    CutA1 is a widespread protein of about 12 kDa found in bacteria, plants, and animals, including humans. The protein was originally identified in a gene locus of Escherichia coli called cutA, which is involved in divalent metal tolerance. The cutA locus consists of two operons, one containing a single gene encoding a cytoplasmic protein, CutA1, and the other composed of two genes encoding the 50-kDa (CutA2) and 24-kDa (CutA3) inner membrane proteins. Molecular genetics studies on the E. coli cutA locus showed that some mutations lead to copper sensitivity due to its increased uptake. However, the specific function of CutA1 in E. coli is still unknown.

    A possible role has, however, been proposed for mammalian CutA1 in the anchoring of the enzyme acetylcholinesterase (AChE) in neuronal cell membranes. CutA1 does not directly interact with AChE, but the CutA1 gene is widely expressed in different regions of the brain with an expression pattern that parallels that of AChE. In addition, CutA1 co-purified with AChE from human caudate nucleus. CutA1, thus, might provide an intriguing link between copper tolerance in bacteria and a complex process in the brain of the most evolved organisms.

    Both rat and E. coli CutA1 have been crystallised. Both proteins are trimeric in the crystals and in solution through an inter-subunit beta-sheet formation. Each monomer exhibits the same overall structure, adopting a ferredoxin-like fold made of an alpha-beta sandwich with antiparallel beta-sheet and containing an additional short strand and a C-terminal helix. In the beta-sheet, alternate strands are connected by helices with positive crossovers, resulting in a double beta-alpha-beta motif where the antiparallel beta-sheet packs against antiparallel alpha-helices. The C-terminal helix packs orthogonal to the N terminus.

    The strong structural similarity of CutA1 with PII proteins might point to a role for CutA1 in signalling through allosteric communication between monomers. CutA1 may be involved in the tuning of a disulphide bond cascade in bacteria and mammals, acting as the PII proteins do in the nitrogen signal cascade in bacteria and plants.

    Proteins where this domain is known:
    PFL2375c   


    PTHR23420 - Ad_hcy_hydrolase (Panther link)

    Interpro entry IPR000043 : S-adenosyl-L-homocysteine hydrolase (Interpro link)

    Interpro description:
    S-adenosyl-L-homocysteine hydrolase (AdoHcyase) is an enzyme of the activated methyl cycle, responsible for the reversible hydration of S-adenosyl-L-homocysteine into adenosine and homocysteine. AdoHcyase is a ubiquitous enzyme which binds and requires NAD+ as a cofactor. AdoHcyase is a highly conserved protein of about 430 to 470 amino acids. The family contains a glycine-rich region in the central part of AdoHcyase, a region thought to be involved in NAD-binding.

    Proteins where this domain is known:
    PFE1050w   


    PTHR23421 - Glyco_hydro_35 (Panther link)

    Interpro entry IPR001944 : Glycoside hydrolase, family 35 (Interpro link)

    Interpro description:

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.

    Glycoside hydrolase family 35 comprises enzymes with only one known activity: beta-galactosidase.

    Mammalian beta-galactosidase is a lysosomal enzyme (gene GLB1) which cleaves the terminal galactose from gangliosides, glycoproteins, and glycosaminoglycans and whose deficiency is the cause of the genetic disease Gm(1) gangliosidosis (Morquio disease type B).

    Proteins where this domain is known:
    MAL13P1.258   


    PTHR23422 - PTHR23422 (Panther link)

    Proteins where this domain is known:
    MAL13P1.248   


    PTHR23422:SF1 - PTHR23422:SF1 (Panther link)

    Proteins where this domain is known:
    MAL13P1.248   


    PTHR23426 - FERREDOXIN/ADRENODOXIN (Panther link)

    Proteins where this domain is known:
    PFL0705c   


    PTHR23426:SF1 - ADRENODOXIN (Panther link)

    Proteins where this domain is known:
    PFL0705c   


    PTHR23428 - Histone_H2B (Panther link)

    Interpro entry IPR000558 : Histone H2B (Interpro link)

    Interpro description:
    Histone H2B is one of the four core histones, along with H2A, H3 and H4. It is a small, highly conserved nuclear protein; two molecules each of histones H2A, H2B, H3 and H4 form the eukaryotic nucleosome core, and the nucleosome octamer winds ~146 DNA base-pairs.

    Proteins where this domain is known:
    PF07_0054    PF11_0062   


    PTHR23429 - G6PDH (Panther link)

    Interpro entry IPR001282 : Glucose-6-phosphate dehydrogenase (Interpro link)

    Interpro description:

    Glucose-6-phosphate dehydrogenase (G6PDH) is a ubiquitous protein, present in bacteria and all eukaryotic cell types. The enzyme catalyses the first step in the pentose pathway, i.e. the conversion of glucose-6-phosphate to gluconolactone 6-phosphate in the presence of NADP, producing NADPH. The ubiquitous expression of the enzyme gives it a major role in the production of NADPH for the many NADPH-mediated reductive processes in all cells. Deficiency of G6PDH is a common genetic abnormality affecting millions of people worldwide. Many sequence variants, most caused by single point mutations, are known, exhibiting a wide variety of phenotypes.

    Proteins where this domain is known:
    PF14_0511   


    PTHR23430 - Histone_H2A (Panther link)

    Interpro entry IPR002119 : Histone H2A (Interpro link)

    Interpro description:
    Histone H2A is a small, highly conserved nuclear protein that, together with 2 molecules each of histones H2B, H3 and H4, forms the eukaryotic nucleosome core; the nucleosome octamer winds ~146 DNA base-pairs.

    Proteins where this domain is known:
    PFC0920w    PFF0860c   


    SM00014 - acidPPc (Smart link)

    Interpro entry IPR000326 : Phosphatidic acid phosphatase type 2/haloperoxidase (Interpro link)

    Interpro description:

    This entry represents type 2 phosphatidic acid phosphatase (PAP2) enzymes, such as phosphatidylglycerophosphatase B from Escherichia coli. PAP2 enzymes have a core structure consisting of a 5-helical bundle, where the beginning of the third helix binds the cofactor. PAP2 enzymes catalyse the dephosphorylation of phosphatidate, yielding diacylglycerol and inorganic phosphate. In eukaryotic cells, PAP activity has a central role in the synthesis of phospholipids and triacylglycerol through its product diacylglycerol, and it also generates and/or degrades lipid-signalling molecules that are related to phosphatidate.

    Other related enzymes have a similar core structure, including haloperoxidases such as bromoperoxidase (contains one core bundle, but forms a dimer), chloroperoxidases (contain two core bundles arranged as in other family dimers), bacitracin transport permease from Bacillus licheniformis, and glucose-6-phosphatase from rat. The vanadium-dependent haloperoxidases exclusively catalyse the oxidation of halides, and act as histidine phosphatases, using histidine for the nucleophilic attack in the first step of the reaction. Amino acid residues involved in binding phosphate/vanadate are conserved between the two families, supporting a proposal that vanadium passes through a tetrahedral intermediate during the reaction mechanism.

    Proteins where this domain is known:
    MAL8P1.202   


    SM00015 - IQ (Smart link)

    Interpro entry IPR000048 : (Interpro link)

    Interpro description:

    Calmodulin (CaM) is recognized as a major calcium sensor and orchestrator of regulatory events through its interaction with a diverse group of cellular proteins. Three classes of recognition motifs exist for many of the known CaM binding proteins: the IQ motif as a consensus for Ca2+-independent binding, and two related motifs for Ca2+-dependent binding, termed 18-14 and 1-5-10 based on the position of conserved hydrophobic residues.

    The regulatory domain of scallop myosin is a three-chain protein complex that switches on this motor in response to Ca2+ binding. Side-chain interactions link the two light chains in tandem to adjacent segments of the heavy chain bearing the IQ-sequence motif. The Ca2+-binding site is a novel EF-hand motif on the essential light chain and is stabilized by linkages involving the heavy chain and both light chains, accounting for the requirement of all three chains for Ca2+ binding and regulation in the intact myosin molecule.

    Proteins where this domain is known:
    MAL13P1.148    PF11_0416    PF11_0540    PF14_0224    PFL0975w   


    SM00025 - Pumilio (Smart link)

    Interpro entry IPR001313 : Pumilio RNA-binding region (Interpro link)

    Interpro description:

    The Drosophila pumilio gene codes for an unusual protein that binds RNA through the Puf domain, which usually occurs as a tandem repeat of eight domains. The FBF-2 protein of Caenorhabditis elegans also has a Puf domain. Both proteins function as translational repressors in early embryonic development by binding sequences in the 3' UTR of target mRNAs. The same type of repetitive domain has been found in a number of other proteins from all eukaryotic kingdoms. The Puf proteins characterised to date have been reported to bind to 3'-untranslated region (UTR) sequences encompassing a so-called UGUR tetranucleotide motif and thereby to repress gene expression by affecting mRNA translation or stability.

    In Saccharomyces cerevisiae (Baker's yeast), five proteins, termed Puf1p to Puf5p, bear six to eight Puf repeats. Puf3p binds nearly exclusively to cytoplasmic mRNAs that encode mitochondrial proteins; Puf1p and Puf2p interact preferentially with mRNAs encoding membrane-associated proteins; Puf4p preferentially binds mRNAs encoding nucleolar ribosomal RNA-processing factors; and Puf5p is associated with mRNAs encoding chromatin modifiers and components of the spindle pole body. This suggests the existence of an extensive network of RNA-protein interactions that coordinate the post-transcriptional fate of large sets of cytotopically and functionally related RNAs through each stage of their lifecycle.
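
    As an illustration of the UGUR tetranucleotide motif mentioned above (R standing for a purine, A or G), the following minimal sketch lists candidate motif positions in an RNA string. The function name find_ugur_sites and the example sequence are hypothetical and made up for this illustration; a match only marks a candidate site and says nothing about actual Puf binding.

```python
import re

# UGUR with R = A or G. The lookahead keeps overlapping hits,
# e.g. "UGUGUA" contains UGUG at 0 and UGUA at 2.
UGUR = re.compile(r"(?=UGU[AG])")

def find_ugur_sites(utr):
    """Return 0-based start positions of UGUR motifs (hypothetical helper)."""
    rna = utr.upper().replace("T", "U")  # accept DNA-style input as well
    return [m.start() for m in UGUR.finditer(rna)]

# Made-up sequence, not a real 3' UTR.
example_utr = "AAUGUAUAUACCUGUGCC"
sites = find_ugur_sites(example_utr)
print(sites)
```

    Positions are reported relative to the start of the supplied string, so the caller is assumed to pass only the 3' UTR portion of a transcript.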

    Proteins where this domain is known:
    PFD0825c    PFE0935c   


    SM00027 - EH (Smart link)

    Interpro entry IPR000261 : (Interpro link)

    Interpro description:

    The EH (for Eps15 Homology) domain is a protein-protein interaction module of approximately 95 residues which was originally identified as a repeated sequence present in three copies at the N-terminus of the tyrosine kinase substrates Eps15 and Eps15R. The EH domain was subsequently found in several proteins implicated in endocytosis, vesicle transport and signal transduction in organisms ranging from yeast to mammals. EH domains are present in one to three copies and they may include calcium-binding domains of the EF-hand type. Eps15 is divided into three domains: domain I contains signatures of a regulatory domain, including a candidate tyrosine phosphorylation site and EF-hand-type calcium-binding domains, domain II presents the characteristic heptad repeats of coiled-coil rod-like proteins, and domain III displays a repeated aspartic acid-proline-phenylalanine motif similar to a consensus sequence of several methylases.

    EH domains have been shown to bind specifically but with moderate affinity to peptides containing short, unmodified motifs through predominantly hydrophobic interactions. The target motifs are divided into three classes: class I consists of the consensus Asn-Pro-Phe (NPF) sequence; class II consists of aromatic and hydrophobic di- and tripeptide motifs, including the Phe-Trp (FW), Trp-Trp (WW), and Ser-Trp-Gly (SWG) motifs; and class III contains the His-(Thr/Ser)-Phe motif (HTF/HSF). The structure of several EH domains has been solved by NMR spectroscopy. The fold consists of two helix-loop-helix motifs characteristic of EF-hand domains, connected by a short antiparallel beta-sheet. The target peptide is bound in a hydrophobic pocket between two alpha helices. Sequence analysis and structural data indicate that not all the EF-hands are capable of binding calcium because of substitutions of the calcium-liganding residues in the loop.
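
    The three target-motif classes listed above can be written as simple sequence patterns. The sketch below scans a peptide given in one-letter code for them. The name scan_eh_motifs and the test peptide are hypothetical, class II is limited to the three motifs named in the text, and a hit is only a candidate site, not evidence of EH-domain binding.

```python
import re

# Motif classes as named in the description; an illustrative encoding only.
EH_MOTIFS = {
    "class I": "NPF",
    "class II": "FW|WW|SWG",
    "class III": "H[TS]F",
}

def scan_eh_motifs(peptide):
    """Return sorted (start, motif, class) tuples, 0-based (hypothetical helper)."""
    seq = peptide.upper()
    hits = []
    for label, pattern in EH_MOTIFS.items():
        # Wrapping the pattern in a lookahead keeps overlapping matches,
        # for example FW at position i followed by WW at position i + 1.
        for m in re.finditer("(?=(%s))" % pattern, seq):
            hits.append((m.start(), m.group(1), label))
    return sorted(hits)

# Made-up peptide for illustration.
for start, motif, label in scan_eh_motifs("AANPFGGHSFWW"):
    print(start, motif, label)
```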

    This domain is often implicated in the regulation of protein transport/sorting and membrane trafficking. Messenger RNA translation initiation and cytoplasmic poly(A) tail shortening require the poly(A)-binding protein (PAB) in yeast. The PAB-dependent poly(A) ribonuclease (PAN) is organised into distinct domains containing repeated sequence elements.

    Proteins where this domain is known:
    PF10_0244   


    SM00028 - TPR (Smart link)

    Interpro entry IPR013026 : (Interpro link)

    Interpro description:

    The tetratricopeptide repeat (TPR) region is a structural motif present in a wide range of proteins. It mediates protein-protein interactions and the assembly of multiprotein complexes. The TPR motif consists of 3-16 tandem repeats of 34 amino acid residues, although individual TPR motifs can be dispersed in the protein sequence. Sequence alignment of the TPR domains reveals a consensus sequence defined by a pattern of small and large amino acids. TPR motifs have been identified in various different organisms, ranging from bacteria to humans. Proteins containing TPRs are involved in a variety of biological processes, such as cell cycle regulation, transcriptional control, mitochondrial and peroxisomal protein transport, neurogenesis and protein folding.

    The X-ray structure of a domain containing three TPRs from protein phosphatase 5 revealed that TPR adopts a helix-turn-helix arrangement, with adjacent TPR motifs packing in a parallel fashion, resulting in a spiral of repeating anti-parallel alpha-helices. The two helices are denoted helix A and helix B. The packing angle between helix A and helix B is ~24° within a single TPR and generates a right-handed superhelical shape. Helix A interacts with helix B and with helix A' of the next TPR. Two protein surfaces are generated: the inner concave surface is contributed mainly by residues on helix A, and the other surface presents residues from both helices A and B.

    Proteins where this domain is known:
    MAL13P1.274    MAL13P1.52    PF07_0026    PF14_0031    PF14_0324    PFC0515c    PFE1545c    PFF0080c    PFF0490w    PFF1505w    PFI1060w    PFL2015w    PFL2120w    PFL2275c   


    SM00029 - GASTRIN (Smart link)

    Interpro entry IPR001651 : Gastrin/cholecystokinin peptide hormone (Interpro link)

    Interpro description:
    Gastrin and cholecystokinin (CCK) are structurally and functionally related peptide hormones that function as hormonal regulators of various digestive processes and feeding behaviors. They are known to induce gastric secretion, stimulate pancreatic secretion, increase blood circulation and water secretion in the stomach and intestine, and stimulate smooth muscle contraction. Originally found in the gut, these hormones have since been shown to be present in various parts of the nervous system. Like many other active peptides they are synthesized as larger protein precursors that are enzymatically converted to their mature forms. They are found in several molecular forms due to tissue-specific post-translational processing. The biological activity of gastrin and CCK is associated with the last five C-terminal residues. One or two positions downstream, there is a conserved sulphated tyrosine residue. The amphibian caerulein skin peptide, the cockroach leukosulphakinin I and II (LSK) peptides, Drosophila melanogaster (Fruit fly) putative CCK-homologs Drosulphakinins I and II, a Gallus gallus (Chicken) gastrin/cholecystokinin-like peptide and cionin, a neuropeptide from the protochordate Ciona intestinalis, belong to the same family.

    Proteins where this domain is known:
    PFE0070w   


    SM00032 - CCP (Smart link)

    Interpro entry IPR000436 : (Interpro link)

    Interpro description:

    Sushi domains, also known as complement control protein (CCP) modules or short consensus repeats (SCR), exist in a wide variety of complement and adhesion proteins. The structure of this domain is known; it is based on a beta-sandwich arrangement, with one face made up of three beta-strands hydrogen-bonded to form a triple-stranded region at its centre and the other face formed from two separate beta-strands.

    CD21 (also called C3d receptor, CR2, Epstein Barr virus receptor or EBV-R) is the receptor for EBV and for C3d, C3dg and iC3b. Complement components may activate B cells through CD21. CD21 is part of a large signal-transduction complex that also involves CD19, CD81, and Leu13.

    Some of the proteins in this group are responsible for the molecular basis of the blood group antigens, surface markers on the outside of the red blood cell membrane. Most of these markers are proteins, but some are carbohydrates attached to lipids or proteins [Reid M.E., Lomas-Francis C. The Blood Group Antigen FactsBook Academic Press, London / San Diego, (1997)]. Complement decay-accelerating factor (Antigen CD55) belongs to the Cromer blood group system and is associated with Cr(a), Dr(a), Es(a), Tc(a/b/c), Wd(a), WES(a/b), IFC and UMC antigens. Complement receptor type 1 (C3b/C4b receptor) (Antigen CD35) belongs to the Knops blood group system and is associated with Kn(a/b), McC(a), Sl(a) and Yk(a) antigens.

    CD molecules are leucocyte antigens on cell surfaces. CD antigens nomenclature is updated at Protein Reviews On The Web (http://mpr.nci.nih.gov/prow/).

    Proteins where this domain is known:
    PFD0295c   


    SM00033 - CH (Smart link)

    Interpro entry IPR001715 : (Interpro link)

    Interpro description:

    The calponin homology domain (also known as CH-domain) is a superfamily of actin-binding domains found in both cytoskeletal proteins and signal transduction proteins. It comprises the following groups of actin-binding domains:

    A comprehensive review of proteins containing this type of actin-binding domains is given in.

    The CH domain is involved in actin binding in some members of the family. However, in calponins there is evidence that the CH domain is not involved in actin binding activity. Most proteins have two copies of the CH domain; however, some proteins such as calponin and the human vav proto-oncogene have only a single copy. The structure of an example CH-domain has recently been solved.

    Proteins where this domain is known:
    PF14_0454   


    SM00044 - CYCc (Smart link)

    Interpro entry IPR001054 : Adenylyl cyclase class-3/4/guanylyl cyclase (Interpro link)

    Interpro description:

    Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.

    In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.

    Proteins where this domain is known:
    MAL13P1.301    PF11_0395    PF14_0043   


    SM00045 - DAGKa (Smart link)

    Interpro entry IPR000756 : Diacylglycerol kinase accessory region (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The DAG kinase domain is assumed to be an accessory domain. Upon cell stimulation, DAG kinase converts DAG into phosphatidate, initiating the resynthesis of phosphatidylinositols and attenuating protein kinase C activity. It catalyses the reaction: ATP + 1,2-diacylglycerol = ADP + 1,2-diacylglycerol 3-phosphate. The enzyme is stimulated by calcium and phosphatidylserine and phosphorylated by protein kinase C. This domain is always associated with

    Proteins where this domain is known:
    PF14_0681    PFI1485c   


    SM00053 - DYNc (Smart link)

    Interpro entry IPR001401 : Dynamin, GTPase region (Interpro link)

    Interpro description:

    Membrane transport between compartments in eukaryotic cells requires proteins that allow the budding and scission of nascent cargo vesicles from one compartment and their targeting and fusion with another. Dynamins are large GTPases that belong to a protein superfamily that, in eukaryotic cells, includes classical dynamins, dynamin-like proteins, OPA1, Mx proteins, mitofusins and guanylate-binding proteins/atlastins, and are involved in the scission of a wide range of vesicles and organelles. They play a role in many processes including budding of transport vesicles, division of organelles, cytokinesis and pathogen resistance.

    The minimal distinguishing architectural features that are common to all dynamins and are distinct from other GTPases are the structure of the large GTPase domain (300 amino acids) and the presence of two additional domains; the middle domain and the GTPase effector domain (GED), which are involved in oligomerization and regulation of the GTPase activity.

    This entry represents the GTPase domain, containing the GTP-binding motifs that are needed for guanine-nucleotide binding and hydrolysis. The conservation of these motifs is absolute except for the final motif in guanylate-binding proteins. The GTPase catalytic activity can be stimulated by oligomerisation of the protein, which is mediated by interactions between the GTPase domain, the middle domain and the GED.

    Proteins where this domain is known:
    PF10_0368    PF11_0465   


    SM00054 - EFh (Smart link)

    Interpro entry IPR002048 : Calcium-binding EF-hand (Interpro link)

    Interpro description:
    Many calcium-binding proteins belong to the same evolutionary family and share a type of calcium-binding domain known as the EF-hand. This type of domain consists of a twelve-residue loop flanked on both sides by a twelve-residue alpha-helical domain. In an EF-hand loop the calcium ion is coordinated in a pentagonal bipyramidal configuration. The six residues involved in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y, -X and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding Ca (bidentate ligand).
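
    The loop numbering described above can be illustrated with a minimal Python sketch. The function names and the example loop string are assumptions made for illustration and are not part of the InterPro entry; the sketch only maps the six coordinating positions to their labels and checks for Glu or Asp at position 12.

```python
# Illustrative sketch only: all names here are hypothetical helpers.
LIGAND_POSITIONS = (1, 3, 5, 7, 9, 12)             # 1-based positions in the loop
LIGAND_LABELS = ("X", "Y", "Z", "-Y", "-X", "-Z")  # labels used in the description


def ef_hand_ligands(loop):
    """Map each coordination label to the residue at its loop position."""
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is expected to have 12 residues")
    return {label: loop[pos - 1]
            for label, pos in zip(LIGAND_LABELS, LIGAND_POSITIONS)}


def has_acidic_position_12(loop):
    """Check for the invariant Glu (E) or Asp (D) at position 12."""
    return ef_hand_ligands(loop)["-Z"] in ("E", "D")


# Example 12-residue loop in one-letter code (assumed input for the demo)
example_loop = "DKDGDGTITTKE"
print(ef_hand_ligands(example_loop))
print(has_acidic_position_12(example_loop))
```

    A check like this only tests the positional pattern; it says nothing about whether a given loop actually binds calcium.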

    Proteins where this domain is known:
    MAL13P1.156    MAL8P1.79    PF07_0072    PF10_0244    PF10_0271    PF10_0301    PF11_0066    PF11_0098    PF11_0239    PF13_0211    PF14_0181    PF14_0224    PF14_0323    PF14_0420    PF14_0443    PF14_0492    PFA0345w    PFB0815w    PFC0420w    PFF0265c    PFF0520w   


    SM00064 - FYVE (Smart link)

    Interpro entry IPR000306 : (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    The FYVE zinc finger is named after four proteins that it has been found in: Fab1, YOTB/ZK632.12, Vac1, and EEA1. The FYVE finger has been shown to bind two zinc ions. The FYVE finger has eight potential zinc coordinating cysteine positions. Many members of this family also include two histidines in a motif R+HHC+XCG, where + represents a charged residue and X any residue. FYVE-type domains are divided into two known classes: FYVE domains that specifically bind to phosphatidylinositol 3-phosphate in lipid bilayers and FYVE-related domains of undetermined function. Those that bind to phosphatidylinositol 3-phosphate are often found in proteins targeted to lipid membranes that are involved in regulating membrane traffic. Most FYVE domains target proteins to endosomes by binding specifically to phosphatidylinositol-3-phosphate at the membrane surface. By contrast, the CARP2 FYVE-like domain is not optimized to bind to phosphoinositides or insert into lipid bilayers. FYVE domains are distinguished from other zinc fingers by three signature sequences: an N-terminal WxxD motif, a basic R(R/K)HHCR patch, and a C-terminal RVC motif.
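
    As a rough illustration, the three signature sequences named above can be written as regular expressions and searched for in N- to C-terminal order. This is a sketch under stated assumptions: 'x' is taken to mean any residue, the input is an upper-case one-letter sequence, and the function names and test string are hypothetical. It is not the method SMART or InterPro use to detect the domain.

```python
import re

# Regular-expression renderings of the three signature sequences.
FYVE_SIGNATURES = {
    "WxxD": re.compile(r"W..D"),
    "R(R/K)HHCR": re.compile(r"R[RK]HHCR"),
    "RVC": re.compile(r"RVC"),
}

# All three signatures in N- to C-terminal order, any spacing in between.
ORDERED_SIGNATURES = re.compile(r"W..D.*R[RK]HHCR.*RVC")


def find_fyve_signatures(sequence):
    """Return the 0-based start of the first match of each signature, or None."""
    hits = {}
    for name, pattern in FYVE_SIGNATURES.items():
        match = pattern.search(sequence)
        hits[name] = match.start() if match else None
    return hits


def has_ordered_fyve_signatures(sequence):
    """True if WxxD, R(R/K)HHCR and RVC all occur in that order."""
    return ORDERED_SIGNATURES.search(sequence) is not None


# Synthetic sequence built only to exercise the patterns
synthetic = "AAWAEDAAARKHHCRAAAARVCAA"
print(find_fyve_signatures(synthetic))
print(has_ordered_fyve_signatures(synthetic))
```

    Short patterns such as RVC match by chance in many unrelated proteins, so a hit here is at most a hint that a sequence deserves a proper profile search.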

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF14_0574   


    SM00065 - GAF (Smart link)

    Interpro entry IPR003018 : (Interpro link)

    Interpro description:
    This domain is present in phytochromes and cGMP-specific phosphodiesterases. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyses the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. A phytochrome is a regulatory photoreceptor which exists in 2 forms that are reversibly interconvertible by light, the PR form that absorbs maximally in the red region of the spectrum, and the PFR form that absorbs maximally in the far-red region. This domain is also found in NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54.

    Proteins where this domain is known:
    PFB0510w   


    SM00072 - GuKc (Smart link)

    Interpro entry IPR008145 : (Interpro link)

    Interpro description:

    This entry represents a domain found in guanylate kinase and in L-type calcium channel.

    Guanylate kinase (GK) catalyzes the ATP-dependent phosphorylation of GMP into GDP. It is essential for recycling GMP and, indirectly, cGMP. In prokaryotes (such as Escherichia coli), lower eukaryotes (such as yeast) and in vertebrates, GK is a highly conserved monomeric protein of about 200 amino acids. GK has been shown to be structurally similar to protein A57R (or SalG2R) from various strains of Vaccinia virus.

    L-type calcium channels are formed from different alpha-1 subunit isoforms that determine the pharmacological properties of the channel, since they form the drug binding domain. Other properties, such as gating voltage-dependence, G protein modulation and kinase susceptibility, are influenced by alpha-2, delta and beta subunits.

    Proteins where this domain is known:
    PFI1420w   


    SM00086 - PAC (Smart link)

    Interpro entry IPR001610 : PAC motif (Interpro link)

    Interpro description:

    PAC motifs occur C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold.

    Proteins where this domain is known:
    PFE0695w   


    SM00088 - PINT (Smart link)

    Interpro entry IPR000717 : (Interpro link)

    Interpro description:
    This homology domain, of unclear function, occurs in the C-terminal region of several regulatory components of the 26S proteasome as well as in other proteins. This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15). Apparently, all of the characterised proteins containing PCI domains are parts of larger multi-protein complexes. Proteins with PCI domains include budding yeast proteasome regulatory components Rpn3 (Sun2), Rpn5, Rpn6, Rpn7 and Rpn9; mammalian proteasome regulatory components p55, p58 and p44.5, and translation initiation factor 3 complex subunits p110 and INT6; Arabidopsis COP9 and FUS6/COP11; mammalian G-protein pathway suppressor GPS1; and several uncharacterised ORFs from plants, nematodes and mammals. The complete homology domain comprises approx. 200 residues; the highest conservation is found in the C-terminal half. Several of the proteins mentioned above have no detectable homology to the N-terminal half of the domain.

    Proteins where this domain is known:
    MAL13P1.190    PF10_0174    PF10_0298    PF11_0303    PF14_0025    PFD0880w    PFE1405c    PFL0310c   


    SM00100 - cNMP (Smart link)

    Interpro entry IPR000595 : (Interpro link)

    Interpro description:
    Proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues. The best studied of these proteins is the prokaryotic catabolite gene activator (also known as the cAMP receptor protein) (gene crp) where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure. There are six invariant amino acids in this domain, three of which are glycine residues that are thought to be essential for maintenance of the structural integrity of the beta-barrel. cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain. The cAPKs are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain. The cGPKs are single chain enzymes that include the two copies of the domain in their N-terminal section. Vertebrate cyclic nucleotide-gated ion-channels also contain this domain. Two such cation channels have been fully characterised; one is found in rod cells where it plays a role in visual signal transduction.

    Proteins where this domain is known:
    PF14_0172    PF14_0173    PF14_0346    PFL1110c   


    SM00101 - 14_3_3 (Smart link)

    Interpro entry IPR000308 : 14-3-3 protein (Interpro link)

    Interpro description:

    The 14-3-3 proteins are a large family of approximately 30kDa acidic proteins which exist primarily as homo- and heterodimers within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.

    14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
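
    The RSxpSxP consensus can be sketched in Python as a sequence pattern plus a phosphorylation check. Plain one-letter sequences carry no phosphorylation marks, so this sketch assumes the phosphorylated positions are supplied separately as a set of 0-based indices. That input format, the function name and the test sequence are all hypothetical.

```python
import re

# RSxpSxP without the phosphorylation mark: R, S, any residue, S, any residue, P.
# A lookahead is used so that overlapping candidate sites are all visited.
CONSENSUS = re.compile(r"(?=RS.S.P)")


def find_1433_consensus_sites(sequence, phospho_positions):
    """Return 0-based indices of phosphoserines lying in an RSxpSxP context.

    phospho_positions is a set of 0-based indices of phosphorylated residues
    (an input convention assumed for this sketch).
    """
    sites = []
    for match in CONSENSUS.finditer(sequence):
        serine_index = match.start() + 3  # the 'pS' residue of RSxpSxP
        if serine_index in phospho_positions:
            sites.append(serine_index)
    return sites


# Synthetic peptide containing one RSxSxP stretch starting at index 2
peptide = "MKRSTSTPLL"
print(find_1433_consensus_sites(peptide, {5}))    # serine at index 5 phosphorylated
print(find_1433_consensus_sites(peptide, set()))  # nothing phosphorylated
```

    Since the description notes that exceptions to the consensus exist, a match is neither necessary nor sufficient for 14-3-3 binding.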

    Proteins where this domain is known:
    MAL13P1.309    MAL8P1.69   


    SM00102 - ADF (Smart link)

    Interpro entry IPR002108 : Actin-binding, cofilin/tropomyosin type (Interpro link)

    Interpro description:

    The actin-depolymerising factor homology (ADF-H) domain is an ~150-amino acid motif that is present in three phylogenetically distinct classes of eukaryotic actin-binding proteins:

    Although these proteins are biochemically distinct and play different roles in actin dynamics, they all appear to use the ADF-H domain for their interactions with actin.

    The ADF-H domain consists of a six-stranded mixed beta-sheet in which the four central strands (beta2-beta5) are anti-parallel and the two edge strands (beta1 and beta6) run parallel with the neighbouring strands. The sheet is surrounded by two alpha-helices on each side.

    Proteins where this domain is known:
    PF13_0326   


    SM00105 - ArfGap (Smart link)

    Interpro entry IPR001164 : Arf GTPase activating protein (Interpro link)

    Interpro description:

    This entry describes a family of small GTPase activating proteins, for example ARF1-directed GTPase-activating protein, the cycle control GTPase activating protein (GAP) GCS1 which is important for the regulation of the ADP ribosylation factor ARF, a member of the Ras superfamily of GTP-binding proteins. The GTP-bound form of ARF is essential for the maintenance of normal Golgi morphology; it participates in recruitment of coat proteins which are required for budding and fission of membranes. Before the fusion with an acceptor compartment the membrane must be uncoated. This step requires the hydrolysis of GTP associated with ARF. These proteins contain a characteristic zinc finger motif (Cys-x2-Cys-x(16,17)-x2-Cys) which displays some similarity to the C4-type GATA zinc finger. The ARFGAP domain displays no obvious similarity to other GAP proteins.

    The 3D structure of the ARFGAP domain of the PYK2-associated protein beta has been solved. It consists of a three-stranded beta-sheet surrounded by 5 alpha helices. The domain is organised around a central zinc atom which is coordinated by 4 cysteines. The ARFGAP domain is clearly unrelated to the other GAP protein structures, which are exclusively helical. Classical GAP proteins accelerate GTPase activity by supplying an arginine finger to the active site. The crystal structure of ARFGAP bound to ARF revealed that the ARFGAP domain does not supply an arginine to the active site, which suggests a more indirect role of the ARFGAP domain in the GTPase hydrolysis.

    The Rev protein of human immunodeficiency virus type 1 (HIV-1) facilitates nuclear export of unspliced and partly-spliced viral RNAs. Rev contains an RNA-binding domain and an effector domain; the latter is believed to interact with a cellular cofactor required for the Rev response and hence HIV-1 replication. Human Rev interacting protein (hRIP) specifically interacts with the Rev effector. The amino acid sequence of hRIP is characterised by an N-terminal, C-4 class zinc finger motif.

    Proteins where this domain is known:
    PF08_0120    PFE1305c    PFL2140c   


    SM00109 - C1 (Smart link)

    Interpro entry IPR002219 : Protein kinase C, phorbol ester/diacylglycerol binding (Interpro link)

    Interpro description:

    Diacylglycerol (DAG) is an important second messenger. Phorbol esters (PE) are analogues of DAG and potent tumour promoters that cause a variety of physiological changes when administered to both cells and tissues. DAG activates a family of serine/threonine protein kinases, collectively known as protein kinase C (PKC). Phorbol esters can directly stimulate PKC. The N-terminal region of PKC, known as C1, has been shown to bind PE and DAG in a phospholipid and zinc-dependent fashion. The C1 region contains one or two copies (depending on the isozyme of PKC) of a cysteine-rich domain, which is about 50 amino-acid residues long, and which is essential for DAG/PE-binding. The DAG/PE-binding domain binds two zinc ions; the ligands of these metal ions are probably the six cysteines and two histidines that are conserved in this domain.

    Proteins where this domain is known:
    PFI1485c   


    SM00116 - CBS (Smart link)

    Interpro entry IPR000644 : (Interpro link)

    Interpro description:

    CBS (cystathionine-beta-synthase) domains are small intracellular modules, mostly found in two or four copies within a protein, that occur in a variety of proteins in bacteria, archaea, and eukaryotes.

    Tandem pairs of CBS domains can act as binding domains for adenosine derivatives and may regulate the activity of attached enzymatic or other domains. In some cases, CBS domains may act as sensors of cellular energy status by being activated by AMP and inhibited by ATP. In chloride ion channels, the CBS domains have been implicated in intracellular targeting and trafficking, as well as in protein-protein interactions, but results vary with different channels: in the CLC-5 channel, the CBS domain was shown to be required for trafficking, while in the CLC-1 channel, the CBS domain was shown to be critical for channel function, but not necessary for trafficking. Recent experiments revealing that CBS domains can bind adenosine-containing ligands such as ATP, AMP, or S-adenosylmethionine have led to the hypothesis that CBS domains function as sensors of intracellular metabolites.

    Crystallographic studies of CBS domains have shown that pairs of CBS sequences form a globular domain where each CBS unit adopts a beta-alpha-beta-beta-alpha pattern. Crystal structure of the CBS domains of the AMP-activated protein kinase in complexes with AMP and ATP shows that the phosphate groups of AMP/ATP lie in a surface pocket at the interface of two CBS domains, which is lined with basic residues, many of which are associated with disease-causing mutations.

    In humans, mutations in conserved residues within CBS domains cause a variety of human hereditary diseases, including (with the gene mutated in parentheses): homocystinuria (cystathionine beta-synthase); Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase); retinitis pigmentosa (IMP dehydrogenase-1); congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members).

    Proteins where this domain is known:
    PFI1020c   


    SM00119 - HECTc (Smart link)

    Interpro entry IPR000569 : HECT (Interpro link)

    Interpro description:

    The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thiolester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the P53 tumour-suppressor protein for ubiquitin-mediated proteolysis.

    Proteins where this domain is known:
    MAL7P1.19    MAL8P1.23    PF11_0201    PFF1365c   


    SM00128 - IPPc (Smart link)

    Interpro entry IPR000300 : Inositol polyphosphate related phosphatase (Interpro link)

    Interpro description:

    This domain is found in diverse proteins homologous to inositol monophosphatase. These proteins are Mg2+-dependent/Li+-sensitive phosphatases that catalyse a variety of reactions.

    Proteins where this domain is known:
    PF07_0024    PF11_0122   


    SM00129 - KISc (Smart link)

    Interpro entry IPR001752 : Kinesin, motor region (Interpro link)

    Interpro description:

    Kinesin is a microtubule-associated force-producing protein that may play a role in organelle transport. The kinesin motor activity is directed toward the microtubule's plus end. Kinesin is an oligomeric complex composed of two heavy chains and two light chains. The maintenance of the quaternary structure does not require interchain disulphide bonds.

    The heavy chain is composed of three structural domains: a large globular N-terminal domain which is responsible for the motor activity of kinesin (it is known to hydrolyse ATP, to bind and move on microtubules), a central alpha-helical coiled coil domain that mediates the heavy chain dimerisation; and a small globular C-terminal domain which interacts with other proteins (such as the kinesin light chains), vesicles and membranous organelles.

    A number of proteins have been recently found that contain a domain similar to that of the kinesin 'motor' domain:

    The kinesin motor domain is located in the N-terminal part of most of the above proteins, with the exception of KAR3, klpA, and ncd where it is located in the C-terminal section.

    The kinesin motor domain contains about 330 amino acids. An ATP-binding motif of type A is found near positions 80 to 90; the C-terminal half of the domain is involved in microtubule-binding.
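
    A hedged Python sketch of how such a motif could be located in a motor-domain sequence follows. It assumes that the "ATP-binding motif of type A" corresponds to the P-loop (Walker A) consensus [AG]-x(4)-G-K-[ST]. That pattern is not given in the description above, and the function name and test sequence are hypothetical.

```python
import re

# P-loop (Walker A) consensus [AG]-x(4)-G-K-[ST] as a regular expression.
# Assumed here to be what the description calls the type A ATP-binding motif.
# The lookahead lets overlapping matches be reported as well.
P_LOOP = re.compile(r"(?=[AG].{4}GK[ST])")


def find_p_loops(domain_sequence):
    """Return the 1-based start position of every P-loop match."""
    return [match.start() + 1 for match in P_LOOP.finditer(domain_sequence)]


# Synthetic sequence: 84 filler residues, one P-loop-like stretch, more filler
synthetic_domain = "M" * 84 + "GQTGSGKT" + "L" * 10
print(find_p_loops(synthetic_domain))
```

    Positions are reported relative to whatever sequence is passed in, so comparing them with the "80 to 90" range only makes sense when the input starts at the beginning of the motor domain.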

    Proteins where this domain is known:
    MAL8P1.132    PF07_0104    PF11_0478    PFA0535c    PFC0770c    PFC0860w    PFL0545w    PFL2165w    PFL2190c   


    SM00130 - KR (Smart link)

    Interpro entry IPR000001 : (Interpro link)

    Interpro description:
    Kringles are autonomous structural domains, found throughout the blood clotting and fibrinolytic proteins. Kringle domains are believed to play a role in binding mediators (e.g., membranes, other proteins or phospholipids), and in the regulation of proteolytic activity. Kringle domains are characterised by a triple loop, 3-disulphide bridge structure, whose conformation is defined by a number of hydrogen bonds and small pieces of anti-parallel beta-sheet. They are found in a varying number of copies in some plasma proteins including prothrombin and urokinase-type plasminogen activator, which are serine proteases belonging to MEROPS peptidase family S1A.

    Proteins where this domain is known:
    PFI0550w   


    SM00133 - S_TK_X (Smart link)

    Interpro entry IPR000961 : AGC-kinase, C-terminal (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    The AGC (cAMP-dependent, cGMP-dependent and protein kinase C) protein kinase family embraces a collection of protein kinases that display a high degree of sequence similarity within their respective kinase domains. AGC kinase proteins are characterised by three conserved phosphorylation sites that critically regulate their function. The first one is located in an activation loop in the centre of the kinase domain. The two other phosphorylation sites are located outside the kinase domain in a conserved region on its C-terminal side, the AGC-kinase C-terminal domain. These sites serve as phosphorylation-regulated switches to control both intra- and inter-molecular interactions. Without these priming phosphorylations, the kinases are catalytically inactive.

    Several structures of the AGC-kinase C-terminal domain have been solved. The first phosphorylation site is located in a turn motif, the second one at the end of the domain in a hydrophobic pocket. In PKB the phosphorylated hydrophobic motif engages a hydrophobic groove within the N-lobe of the kinase domain which orders alpha helices close to the active site.

    Proteins where this domain is known:
    PFI1685w    PFL2250c   


    SM00145 - PI3Ka (Smart link)

    Interpro entry IPR001263 : Phosphoinositide 3-kinase accessory region PIK (Interpro link)

    Interpro description:

    Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The role of the accessory domain of phosphoinositide 3-kinase (PI3-kinase) is unclear. It may be involved in substrate presentation.

    Proteins where this domain is known:
    PFE0765w   


    SM00146 - PI3Kc (Smart link)

    Interpro entry IPR000403 : Phosphatidylinositol 3- and 4-kinase, catalytic (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    Phosphatidylinositol 3-kinase (PI3-kinase) is an enzyme that phosphorylates phosphoinositides on the 3-hydroxyl group of the inositol ring. The three products of PI3-kinase - PI-3-P, PI-3,4-P(2) and PI-3,4,5-P(3) - function as secondary messengers in cell signalling. Phosphatidylinositol 4-kinase (PI4-kinase) is an enzyme that acts on phosphatidylinositol (PI) in the first committed step in the production of the secondary messenger inositol-1,4,5-trisphosphate. This domain is also present in a wide range of protein kinases, involved in diverse cellular functions, such as control of cell growth, regulation of cell cycle progression, a DNA damage checkpoint, recombination, and maintenance of telomere length. Despite significant homology to lipid kinases, no lipid kinase activity has been demonstrated for any of the PIK-related kinases.

    The PI3- and PI4-kinases share a well conserved domain at their C-terminal section; this domain seems to be distantly related to the catalytic domain of protein kinases. The catalytic domain of PI3K has the typical bilobal structure that is seen in other ATP-dependent kinases, with a small N-terminal lobe and a large C-terminal lobe. The core of this domain is the most conserved region of the PI3Ks. The ATP cofactor binds in the crevice formed by the N- and C-terminal lobes; a loop between two strands provides a hydrophobic pocket for binding of the adenine moiety, and a lysine residue interacts with the alpha-phosphate. In contrast to protein kinases, the PI3K loop which interacts with the phosphates of the ATP, known as the glycine-rich or P-loop, contains no glycine residues. Instead, contact with the ATP -phosphate is maintained through the side chain of a conserved serine residue.

    Proteins where this domain is known:
    PFD0965W    PFE0485w    PFE0765w   


    SM00148 - PLCXc (Smart link)

    Interpro entry IPR000909 : Phospholipase C, phosphatidylinositol-specific , X region (Interpro link)

    Interpro description:
    Phosphatidylinositol-specific phospholipase C, a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins. In mammals, there are at least 6 different isoforms of PI-PLC; they differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC. All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as the 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. Profile analysis shows that sequences with significant similarity to the X-box domain also occur in prokaryotic and trypanosome PI-specific phospholipases C. Apart from this region, the prokaryotic enzymes show no similarity to their eukaryotic counterparts.

    Proteins where this domain is known:
    PF10_0132   


    SM00149 - PLCYc (Smart link)

    Interpro entry IPR001711 : Phospholipase C, phosphatidylinositol-specific, Y domain (Interpro link)

    Interpro description:

    Phosphatidylinositol-specific phospholipase C, a eukaryotic intracellular enzyme, plays an important role in signal transduction processes. It catalyzes the hydrolysis of 1-phosphatidyl-D-myo-inositol-4,5-bisphosphate into the second messenger molecules diacylglycerol and inositol-1,4,5-trisphosphate. This catalytic process is tightly regulated by reversible phosphorylation and binding of regulatory proteins.

    In mammals, there are at least 6 different isoforms of PI-PLC; they differ in their domain structure, their regulation, and their tissue distribution. Lower eukaryotes also possess multiple isoforms of PI-PLC.

    All eukaryotic PI-PLCs contain two regions of homology, sometimes referred to as the 'X-box' and 'Y-box'. The order of these two regions is always the same (NH2-X-Y-COOH), but the spacing is variable. In most isoforms, the distance between these two regions is only 50-100 residues, but in the gamma isoforms one PH domain, two SH2 domains, and one SH3 domain are inserted between the two PLC-specific domains. The two conserved regions have been shown to be important for the catalytic activity. C-terminal to the Y-box, there is a C2 domain possibly involved in Ca-dependent membrane attachment.

    Proteins where this domain is known:
    PF10_0132   


    SM00154 - ZnF_AN1 (Smart link)

    Interpro entry IPR000058 : Zinc finger, AN1-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the AN1-type zinc finger domain, which has a dimetal (zinc)-bound alpha/beta fold. This domain was first identified as a zinc finger at the C-terminus of AN1, a ubiquitin-like protein in Xenopus laevis. The AN1-type zinc finger contains six conserved cysteines and two histidines that could potentially coordinate 2 zinc atoms.

    Certain stress-associated proteins (SAP) contain an AN1 domain, often in combination with A20 zinc finger domains (SAP8) or C2H2 domains (SAP16). For example, the human protein Znf216 has an A20 zinc-finger at the N-terminus and an AN1 zinc-finger at the C-terminus, acting to negatively regulate the NFkappaB activation pathway and to interact with components of the immune response like RIP, IKKgamma and TRAF6. The interaction of Znf216 with IKK-gamma and RIP is mediated by the A20 zinc-finger domain, while its interaction with TRAF6 is mediated by the AN1 zinc-finger domain; therefore, both zinc-finger domains are involved in regulating the immune response. The AN1 zinc finger domain is also found in proteins containing a ubiquitin-like domain, which are involved in the ubiquitination pathway. Proteins containing an AN1-type zinc finger include:

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0056    PFE0200c   


    SM00155 - PLDc (Smart link)

    Interpro entry IPR001736 : Phospholipase D/Transphosphatidylase (Interpro link)

    Interpro description:

    Phosphatidylcholine-hydrolysing phospholipase D (PLD) isoforms are activated by ADP-ribosylation factors (ARFs). PLD produces phosphatidic acid from phosphatidylcholine, which may be essential for the formation of certain types of transport vesicles or may be constitutive vesicular transport to signal transduction pathways. PC-hydrolysing PLD is a homologue of cardiolipin synthase, phosphatidylserine synthase, bacterial PLDs, and viral proteins. Each of these appears to possess a domain duplication which is apparent by the presence of two motifs containing well-conserved histidine, lysine, and/or asparagine residues which may contribute to the active site aspartic acid. An Escherichia coli endonuclease (nuc) and similar proteins appear to be PLD homologues but possess only one of these motifs.

    Proteins where this domain is known:
    PFF0465c   


    SM00156 - PP2Ac (Smart link)

    Interpro entry IPR006186 : Serine/threonine-specific protein phosphatase and bis(5-nucleosyl)-tetraphosphatase (Interpro link)

    Interpro description:

    Protein phosphorylation plays a central role in the regulation of cell functions, causing the activation or inhibition of many enzymes involved in various biochemical pathways. Kinases and phosphatases are the enzymes responsible for this, and may themselves be subject to control through the action of hormones and growth factors. Serine/threonine (S/T) phosphatases catalyse the dephosphorylation of phosphoserine and phosphothreonine residues. In mammalian tissues four different types of protein phosphatase (PP) have been identified and are known as PP1, PP2A, PP2B and PP2C. Except for PP2C, these enzymes are evolutionarily related. The catalytic regions of the proteins are well conserved and have a slow mutation rate, suggesting that major changes in these regions are highly detrimental.

    Protein phosphatase-1 (PP1) and protein phosphatase-2A (PP2A) have a broad specificity and there are two closely related isoforms of each, alpha and beta. PP2A is a trimeric enzyme that consists of a core composed of a catalytic subunit associated with a 65 kDa regulatory subunit and a third variable subunit. Protein phosphatase-2B (PP2B or calcineurin), a calcium-dependent enzyme whose activity is stimulated by calmodulin, is composed of two subunits: the catalytic A-subunit and the calcium-binding B-subunit. The specificity of PP2B is restricted. Other serine/threonine specific protein phosphatases that have been characterised include mammalian phosphatase-X (PP-X) and Drosophila phosphatase-V (PP-V), which are closely related to, yet distinct from, PP2A; yeast phosphatase PPH3, which is similar to PP2A but has different enzymatic properties; and Drosophila phosphatase-Y (PP-Y) and yeast phosphatases Z1 and Z2, which are closely related to, yet distinct from, PP1.

    Proteins where this domain is known:
    MAL13P1.274    PF08_0129    PF10_0177    PF14_0142    PF14_0224    PF14_0630    PF14_0660    PFC0595c    PFI1245c    PFI1360c   


    SM00160 - RanBD (Smart link)

    Interpro entry IPR000156 : Ran Binding Protein 1 (Interpro link)

    Interpro description:

    Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran Binding Protein 1 (RanBP1) has guanine nucleotide dissociation inhibitory (GDI) activity specific for the GTP form of Ran, and also stimulates Ran GTPase activating protein (GAP)-mediated GTP hydrolysis by Ran. RanBP1 contributes to keeping the gradient of RanGTP across the nuclear envelope high (GDI activity) or the cytoplasmic levels of RanGTP low (GAP cofactor).

    All RanBP1 proteins contain a Ran binding domain of approximately 150 amino acid residues. RanBP1 binds directly to RanGTP with high affinity. There are four sites of contact between Ran and the Ran binding domain. One of these involves binding of the C-terminal segment of Ran to a groove on the Ran binding domain that is analogous to the surface utilised in the EVH1-peptide interaction. Nup358 contains four Ran binding domains. The structure of the first of these is known.

    Proteins where this domain is known:
    PFD0207c    PFD0950w   


    SM00164 - TBC (Smart link)

    Interpro entry IPR000195 : RabGAP/TBC (Interpro link)

    Interpro description:
    Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases.

    Proteins where this domain is known:
    MAL13P1.244    PF11_0151    PF13_0117    PF14_0699    PFE0330w    PFI0195c    PFI0345w    PFL1445w   


    SM00165 - UBA (Smart link)

    Interpro entry IPR015940 : (Interpro link)

    Interpro description:

    UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. The human homologue of yeast Rad23A is one example of a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The solution structure of human Rad23A UBA(2) showed that the domain forms a compact three-helix bundle. Comparison of the structures of UBA(1) and UBA(2) reveals that both form very similar folds and have a conserved large hydrophobic surface patch which may be a common protein-interacting surface present in diverse UBA domains. Evidence that ubiquitin binds to UBA domains leads to the prediction that the hydrophobic surface patch of UBA domains interacts with the hydrophobic surface on the five-stranded beta-sheet of ubiquitin.

    This domain is similar in sequence to the N-terminal domain of translation elongation factor EF1B (or EF-Ts) from bacteria, mitochondria and chloroplasts.

    More information about EF1B (EF-Ts) proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PF10_0114    PF11_0142    PF11_0329    PF13_0301    PFD0655c   


    SM00166 - UBX (Smart link)

    Interpro entry IPR001012 : (Interpro link)

    Interpro description:
    The UBX domain is found in ubiquitin-regulatory proteins, which are members of the ubiquitination pathway, as well as a number of other proteins including FAF-1 (FAS-associated factor 1), the human Rep-8 reproduction protein and several hypothetical proteins from yeast. The function of the UBX domain is not known although the fragment of avian FAF-1 containing the UBX domain causes apoptosis of transfected cells.

    Proteins where this domain is known:
    MAL8P1.122    PFI1680w   


    SM00173 - RAS (Smart link)

    Interpro entry IPR003577 : Ras small GTPase, Ras type (Interpro link)

    Interpro description:

    Ras proteins are small GTPases that regulate cell growth, proliferation and differentiation. The different Ras isoforms (H-ras, N-ras and K-ras) generate distinct signal outputs, despite interacting with a common set of activators and effectors. Ras is activated by guanine nucleotide exchange factors (GEFs) that release GDP and allow GTP binding. Many RasGEFs have been identified. These are sequestered in the cytosol until activation by growth factors triggers recruitment to the plasma membrane or Golgi, where the GEF colocalizes with Ras. Active GTP-bound Ras interacts with several effector proteins: among the best characterised are the Raf kinases, phosphatidylinositol 3-kinase (PI3K), RalGEFs and NORE/MST1.

    Ras proteins are synthesized as cytosolic precursors that undergo post-translational processing to be able to associate with cell membranes. First, protein farnesyl transferase, a cytosolic enzyme, attaches a farnesyl group to the cysteine residue of the CAAX motif. Second, the farnesylated CAAX sequence targets Ras to the cytosolic surface of the ER where an endopeptidase removes the AAX tripeptide. Third, the alpha-carboxyl group on the now carboxy-terminal farnesylcysteine is methylated by isoprenylcysteine carboxyl methyltransferase. Finally, after methylation, Ras proteins take one of two routes to the cell surface, which is dictated by a second targeting signal that is located immediately amino-terminal to the farnesylated cysteine. N-ras and H-ras are expressed stably on the plasma membrane, on Golgi in transfected cells, and at least transiently on the ER. Ras has also been visualized on endosomes.
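
    The proteolytic step described above can be sketched in a few lines of Python. This is only an illustration: the function name `process_caax` is hypothetical, the A and X positions are treated as unconstrained because the text does not define them, and the example sequence is a toy string rather than an annotated protein.

```python
def process_caax(sequence):
    """Return the sequence left after removal of the C-terminal AAX tripeptide.

    Assumption for this sketch: a CAAX-like C-terminus is any sequence whose
    fourth-from-last residue is cysteine. Returns None when no such
    C-terminus is present.
    """
    seq = sequence.upper()
    if len(seq) < 4 or seq[-4] != "C":
        return None
    # After cleavage the (farnesylated) cysteine becomes the C-terminal residue.
    return seq[:-3]


if __name__ == "__main__":
    toy = "MTEYKCVLS"  # toy sequence ending in C-V-L-S
    print(process_caax(toy))
```

    A real screen would also restrict the residues allowed at the A and X positions; that refinement is left out here because the description does not specify it.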

    Proteins where this domain is known:
    MAL13P1.205    MAL13P1.51    PF08_0110    PF11_0183    PF11_0461    PF13_0119    PFE0625w    PFE0690c   


    SM00174 - RHO (Smart link)

    Interpro entry IPR003578 : Ras small GTPase, Rho type (Interpro link)

    Interpro description:

    Small GTPases are involved in intracellular cell signalling processes. The Ras family includes a large number of small GTPases. Members of the Rho subfamily of Ras-like small GTPases include Cdc42 and Rac, as well as Rho isoforms.

    The crystal structures of several members of this entry have been determined: Rnd3/RhoE, RhoA and Cdc42.

    Proteins where this domain is known:
    MAL13P1.205    MAL13P1.51    PF08_0110    PF11_0183    PF11_0461    PF13_0119    PFE0625w    PFE0690c   


    SM00175 - RAB (Smart link)

    Interpro entry IPR003579 : Ras small GTPase, Rab type (Interpro link)

    Interpro description:

    Small GTPases are involved in intracellular cell signalling processes. The Ras family includes a large number of small GTPases. Members of the Rab GTPases subfamily have been implicated in vesicle trafficking.

    The crystal structures of several members of this entry have been determined.

    Proteins where this domain is known:
    MAL13P1.205    MAL13P1.51    PF08_0110    PF10_0203    PF10_0337    PF11_0183    PF11_0461    PF13_0119    PFA0335w    PFB0500c    PFE0625w    PFE0690c    PFI0155c    PFL1500w   


    SM00176 - RAN (Smart link)

    Interpro entry IPR002041 : Ran GTPase (Interpro link)

    Interpro description:

    Ran (or TC4) is an evolutionarily conserved member of the Ras superfamily of small GTPases that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran has been implicated in a large number of processes, including nucleocytoplasmic transport, RNA synthesis, processing and export and cell cycle checkpoint control. Ran plays a crucial role in both import/export pathways and determines the directionality of nuclear transport. Import receptors (importins) bind their cargos in the cytoplasm where the concentration of RanGTP is low (due to action of RanGAP), and release their cargos in the nucleus where the concentration of RanGTP is high (due to action of RanGEF). Export receptors (exportins) respond to RanGTP in the opposite manner. Furthermore, it has been shown that nuclear transport factor 2 (NTF2) stimulates efficient nuclear import of a cargo protein. NTF2 binds specifically to RanGDP and to the FXFG repeat containing nucleoporins.

    Ran is generally included in the RAS 'superfamily' of small GTP-binding proteins, but it is only slightly related to the other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at its C-terminal and is therefore not subject to prenylation. Instead, Ran has an acidic C-terminus. It is, however, similar to RAS family members in requiring a specific guanine nucleotide exchange factor (GEF) and a specific GTPase activating protein (GAP) as stimulators of overall GTPase activity.

    Ran consists of a core domain that is structurally similar to the GTP-binding domains of other small GTPases but, in addition, Ran has a C-terminal extension consisting of an unstructured linker and a 16 residue alpha-helix that is located opposite the "Switch I" region in the RanGDP structure. Three regions of Ran change conformation depending on the nucleotide bound: the Switch I and II regions, which interact with the bound nucleotide, and the C-terminal extension. In RanGDP, the C-terminal extension contacts the core of the protein, while in RanGTP the extension extends away from the core, most likely due to a steric clash between the Switch I region and the linker part of the C-terminal extension. This suggests that the C-terminal extension in RanGDP is crucial for shielding residues in the core domain and preventing the switch regions from adopting a GTP-like form. This prevents binding of transport factors to RanGDP that would otherwise lead to uncoordinated interaction between importin beta-like proteins and cellular factors.

    More information about these proteins can be found at Protein of the Month: Importins.

    Proteins where this domain is known:
    MAL13P1.205    PF08_0110    PF11_0183    PF11_0461    PF13_0119    PFE0625w    PFE0690c   
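
    The RAS, RHO, RAB and RAN listings above share many identifiers. A minimal Python sketch of how the overlap could be checked follows; the identifier lists are copied from those four listings, while the helper names `shared_by_all` and `unique_to` are hypothetical and chosen only for this example.

```python
# Identifiers copied from the four "Proteins where this domain is known" lists above.
domains = {
    "RAS": {"MAL13P1.205", "MAL13P1.51", "PF08_0110", "PF11_0183",
            "PF11_0461", "PF13_0119", "PFE0625w", "PFE0690c"},
    "RHO": {"MAL13P1.205", "MAL13P1.51", "PF08_0110", "PF11_0183",
            "PF11_0461", "PF13_0119", "PFE0625w", "PFE0690c"},
    "RAB": {"MAL13P1.205", "MAL13P1.51", "PF08_0110", "PF10_0203",
            "PF10_0337", "PF11_0183", "PF11_0461", "PF13_0119",
            "PFA0335w", "PFB0500c", "PFE0625w", "PFE0690c",
            "PFI0155c", "PFL1500w"},
    "RAN": {"MAL13P1.205", "PF08_0110", "PF11_0183", "PF11_0461",
            "PF13_0119", "PFE0625w", "PFE0690c"},
}


def shared_by_all(domain_map):
    """Identifiers that appear in every listing."""
    return set.intersection(*domain_map.values())


def unique_to(name, domain_map):
    """Identifiers that appear only in the listing called `name`."""
    others = set().union(*(ids for key, ids in domain_map.items() if key != name))
    return domain_map[name] - others


if __name__ == "__main__":
    print(sorted(shared_by_all(domains)))
    print(sorted(unique_to("RAB", domains)))
```

    Overlap between listings only shows that the same protein matches several SMART models; it does not by itself say which subfamily a protein belongs to.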


    SM00177 - ARF (Smart link)

    Interpro entry IPR006688 : ADP-ribosylation factor (Interpro link)

    Interpro description:

    The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.

    The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.

    ADP-ribosylation factors (ARFs) are 20 kDa GTP-binding proteins involved in protein trafficking. They may modulate vesicle budding and uncoating within the Golgi apparatus. ARFs also act as allosteric activators of cholera toxin ADP-ribosyltransferase activity. They are evolutionarily conserved and present in all eukaryotes. At least six forms of ARF are present in mammals and three in budding yeast. The ARF family also includes proteins highly related to ARFs but which lack the cholera toxin cofactor activity; they are collectively known as ARLs (ARF-like). The ARFs are N-terminally myristoylated (the ARLs have not yet been shown to be modified in such a fashion).

    Proteins where this domain is known:
    MAL13P1.51    PF08_0110    PF10_0203    PF10_0337    PF11_0461    PF14_0399    PFD0810w    PFE0625w    PFE0690c   


    SM00178 - SAR (Smart link)

    Interpro entry IPR006687 : GTP-binding protein SAR1 (Interpro link)

    Interpro description:

    The small ADP ribosylation factor (Arf) GTP-binding proteins are major regulators of vesicle biogenesis in intracellular traffic. They are the founding members of a growing family that includes Arl (Arf-like), Arp (Arf-related proteins) and the remotely related Sar (Secretion-associated and Ras-related) proteins. Arf proteins cycle between inactive GDP-bound and active GTP-bound forms that bind selectively to effectors. The classical structural GDP/GTP switch is characterised by conformational changes at the so-called switch 1 and switch 2 regions, which bind tightly to the gamma-phosphate of GTP but poorly or not at all to the GDP nucleotide. Structural studies of Arf1 and Arf6 have revealed that although these proteins feature the switch 1 and 2 conformational changes, they depart from other small GTP-binding proteins in that they use an additional, unique switch to propagate structural information from one side of the protein to the other.

    The GDP/GTP structural cycles of human Arf1 and Arf6 feature a unique conformational change that affects the beta2-beta3 strands connecting switch 1 and switch 2 (interswitch) and also the amphipathic helical N-terminus. In GDP-bound Arf1 and Arf6, the interswitch is retracted and forms a pocket to which the N-terminal helix binds, the latter serving as a molecular hasp to maintain the inactive conformation. In the GTP-bound form of these proteins, the interswitch undergoes a two-residue register shift that pulls switch 1 and switch 2 'up', restoring an active conformation that can bind GTP. In this conformation, the interswitch projects out of the protein and extrudes the N-terminal hasp by occluding its binding pocket.

    The SAR1 protein, first identified in budding yeast, is a 21 kDa GTP-binding protein involved in vesicular transport between the endoplasmic reticulum and the Golgi. It takes part in the formation of secretory vesicles by binding to an ER type II membrane protein, Sec12p. It is evolutionarily conserved and seems to be present in all eukaryotes.

    SAR1 is generally included in the RAS 'superfamily' of small GTP-binding proteins, but it is only slightly related to other RAS proteins. It also differs from RAS proteins in that it lacks cysteine residues at the C terminus and is therefore not subject to prenylation. SAR1 is slightly related to ARFs.

    Proteins where this domain is known:
    PF10_0203    PF10_0337    PFD0810w   


    SM00181 - EGF (Smart link)

    Interpro entry IPR006210 : (Interpro link)

    Interpro description:

    Epidermal growth factors and transforming growth factors belong to a general class of proteins that share a repeat pattern involving a number of conserved Cys residues. Growth factors are involved in cell recognition and division. The repeating pattern, especially of cysteines (the so-called EGF repeat), is thought to be important to the 3D structure of the proteins, and hence to their recognition by receptors and other molecules. The type 1 EGF signature includes six conserved cysteines believed to be involved in disulphide bond formation. The EGF motif is found frequently in nature, particularly in extracellular proteins.

    Proteins where this domain is known:
    MAL7P1.92    PF10_0027    PF10_0302    PF10_0303    PFB0305c    PFC1045c    PFE0120c    PFF0995c    PFF1120c   


    SM00182 - CULLIN (Smart link)

    Interpro entry IPR016158 : Cullin homology (Interpro link)

    Interpro description:

    Cullins are a family of hydrophobic proteins that act as scaffolds for ubiquitin ligases (E3). Cullins are found throughout eukaryotes. Humans express seven cullins (Cul1, 2, 3, 4A, 4B, 5 and 7), each forming part of a multi-subunit ubiquitin complex. Cullin-RING ubiquitin ligases (CRLs), such as Cul1 (SCF), play an essential role in targeting proteins for ubiquitin-mediated destruction; as such, they are diverse in terms of composition and function, regulating many different processes from glucose sensing and DNA replication to limb patterning and circadian rhythms. The catalytic core of CRLs consists of a RING protein and a cullin family member. For Cul1, the C-terminal cullin-homology domain binds the RING protein. The RING protein appears to function as a docking site for ubiquitin-conjugating enzymes (E2s). Other proteins contain a cullin-homology domain, such as the APC2 subunit of the anaphase-promoting complex/cyclosome and the p53 cytoplasmic anchor PARC; both APC2 and PARC have ubiquitin ligase activity. The N-terminal region of cullins is more variable, and is used to interact with specific adaptor proteins.

    This entry represents the cullin homology region, which is composed of three domains: a 4-helical bundle domain, an alpha+beta domain, and a winged helix-like domain.

    Proteins where this domain is known:
    PF08_0094    PFF1445c   


    SM00184 - RING (Smart link)

    Interpro entry IPR001841 : Zinc finger, RING-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents RING-type zinc finger domains. The RING-finger is a specialised type of Zn-finger of 40 to 60 residues that binds two atoms of zinc, and is probably involved in mediating protein-protein interactions. There are two different variants, the C3HC4-type and the C3H2C3-type, which are clearly related despite the different cysteine/histidine pattern. The latter type is sometimes referred to as 'RING-H2 finger'. The RING domain is a protein interaction domain that has been implicated in a range of diverse biological processes. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain. E3 ubiquitin-protein ligases determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3. Various RING fingers also exhibit binding to E2 ubiquitin-conjugating enzymes (Ubc's).

    Several 3D-structures for RING-fingers are known. The 3D structure of the zinc ligation system is unique to the RING domain and is referred to as the 'cross-brace' motif. The spacing of the cysteines in such a domain is C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. Metal ligand pairs one and three co-ordinate to bind one zinc ion, whilst pairs two and four bind the second.
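
    The spacing quoted above translates directly into a regular expression. The Python sketch below is a hedged illustration only: the names `RING_C3HC4` and `find_ring_candidates` are hypothetical, the example sequence is synthetic and built purely to satisfy the spacing, and a match indicates compatible cysteine/histidine spacing, not demonstrated zinc binding.

```python
import re

# C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C
RING_C3HC4 = re.compile(
    r"C.{2}C.{9,39}C.{1,3}H.{2,3}C.{2}C.{4,48}C.{2}C"
)


def find_ring_candidates(sequence):
    """Return (start, end) spans of non-overlapping matches (0-based, end exclusive)."""
    return [m.span() for m in RING_C3HC4.finditer(sequence.upper())]


# Synthetic sequence assembled to fit the spacing; not a real protein.
toy = ("MA" + "C" + "AA" + "C" + "A" * 10 + "C" + "AA" + "H" + "AA"
       + "C" + "AA" + "C" + "A" * 5 + "C" + "AA" + "C" + "GG")

if __name__ == "__main__":
    print(find_ring_candidates(toy))
```

    The expression covers only the C3HC4 spacing as written in the text; the RING-H2 variant would need a histidine allowed at the corresponding ligand position.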

    Note that in the older literature, some RING-fingers are denoted as LIM-domains. The LIM-domain Zn-finger is a fundamentally different family, albeit with similar Cys-spacing.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.216    MAL13P1.224    MAL13P1.76    MAL7P1.155    PF10_0046    PF10_0072    PF10_0117    PF10_0276    PF11_0244    PF11_0330    PF13_0053    PF13_0188    PF14_0054    PF14_0139    PF14_0215    PF14_0416    PF14_0479    PFA0165c    PFB0440c    PFC0510w    PFC0610c    PFC0690c    PFC0740c    PFC0845c    PFD0765w    PFE0100w    PFE0610c    PFE0900w    PFE1490c    PFF0165c    PFF0355c    PFF1180w    PFF1185w    PFF1325c    PFF1440w    PFL0275w    PFL0440c    PFL0575w    PFL1010c    PFL1620w    PFL1705w    PFL2440w   


    SM00185 - ARM (Smart link)

    Interpro entry IPR000225 : (Interpro link)

    Interpro description:

    The armadillo (Arm) repeat is an approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila melanogaster segment polarity gene armadillo involved in signal transduction through wingless. Animal Arm-repeat proteins function in various processes, including intracellular signalling and cytoskeletal regulation, and include such proteins as beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumour suppressor protein, and the nuclear transport factor importin-alpha, amongst others. A subset of these proteins is conserved across eukaryotic kingdoms. In higher plants, some Arm-repeat proteins function in intracellular signalling like their mammalian counterparts, while others have novel functions.

    The 3-dimensional fold of an armadillo repeat is known from the crystal structure of beta-catenin, where the 12 repeats form a superhelix of alpha helices with three helices per unit. The cylindrical structure features a positively charged groove, which presumably interacts with the acidic surfaces of the known interaction partners of beta-catenin.

    Proteins where this domain is known:
    MAL13P1.308    PF08_0087    PF11_0318   


    SM00195 - DSPc (Smart link)

    Interpro entry IPR000340 : Protein-tyrosine phosphatase, dual specificity (Interpro link)

    Interpro description:

    Protein tyrosine (pTyr) phosphorylation is a common post-translational modification which can create novel recognition motifs for protein interactions and cellular localisation, affect protein stability, and regulate enzyme activity. Consequently, maintaining an appropriate level of protein tyrosine phosphorylation is essential for many cellular functions. Tyrosine-specific protein phosphatases (PTPases) catalyse the removal of a phosphate group attached to a tyrosine residue, using a cysteinyl-phosphate enzyme intermediate. These enzymes are key regulatory components in signal transduction pathways (such as the MAP kinase pathway) and cell cycle control, and are important in the control of cell growth, proliferation, differentiation and transformation. The PTP superfamily can be divided into four subfamilies:

    Based on their cellular localisation, PTPases are also classified as:

    All PTPases carry the highly conserved active site motif C(X)5R (PTP signature motif), employ a common catalytic mechanism, and share a similar core structure made of a central parallel beta-sheet with flanking alpha-helices containing a beta-loop-alpha-loop that encompasses the PTP signature motif. Functional diversity between PTPases is endowed by regulatory domains and subunits.

    This entry represents dual specificity protein-tyrosine phosphatases. Ser/Thr and Tyr dual specificity phosphatases are a group of enzymes with both Ser/Thr and tyrosine specific protein phosphatase activity, able to remove either the serine/threonine- or the tyrosine-bound phosphate group from a wide range of phosphoproteins, including a number of enzymes which have been phosphorylated under the action of a kinase. Dual specificity protein phosphatases (DSPs) regulate mitogenic signal transduction and control the cell cycle. The crystal structure of a human DSP, vaccinia H1-related phosphatase (or VHR), has been determined at 2.1 angstrom resolution. A shallow active site pocket in VHR allows for the hydrolysis of phosphorylated serine, threonine, or tyrosine protein residues, whereas the deeper active site of protein tyrosine phosphatases (PTPs) restricts substrate specificity to only phosphotyrosine. Positively charged crevices near the active site may explain the enzyme's preference for substrates with two phosphorylated residues. The VHR structure defines a conserved structural scaffold for both DSPs and PTPs. A "recognition region" connecting helix alpha1 to strand beta1 may determine differences in substrate specificity between VHR, the PTPs, and other DSPs.

    These proteins may also have inactive phosphatase domains, and depending on the domain composition this loss of catalytic activity has different effects on protein function. Inactive single domain phosphatases can still specifically bind substrates, and protect against dephosphorylation, while the inactive domains of tandem phosphatases can be further subdivided into two classes. Those which bind phosphorylated tyrosine residues may recruit multi-phosphorylated substrates for the adjacent active domains and are more conserved, while the other class have accumulated several variable amino acid substitutions and have a complete loss of tyrosine binding capability. The second class shows a release of evolutionary constraint for the sites around the catalytic centre, which emphasises a difference in function from the first group. There is a region of higher conservation common to both classes, suggesting a new regulatory centre.

    Proteins where this domain is known:
    PF11_0139   


    SM00202 - SR (Smart link)

    Interpro entry IPR017448 : Speract/scavenger receptor related (Interpro link)

    Interpro description:

    The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (may act as a receptor in regulating T-cell proliferation).

    Proteins where this domain is known:
    PF14_0067   


    SM00209 - TSP1 (Smart link)

    Interpro entry IPR000884 : (Interpro link)

    Interpro description:

    Thrombospondins are multimeric multidomain glycoproteins that function at cell surfaces and in the extracellular matrix milieu. They act as regulators of cell interactions in vertebrates. They are divided into two subfamilies, A and B, according to their overall molecular organisation. The subgroup A proteins TSP-1 and -2 contain an N-terminal domain, a VWFC domain, three TSP1 repeats, three EGF-like domains, TSP3 repeats and a C-terminal domain. They are assembled as trimers. The subgroup B thrombospondins, designated TSP-3, -4, and COMP (cartilage oligomeric matrix protein, also designated TSP-5) are distinct in that they contain unique N-terminal regions, lack the VWFC domain and TSP1 repeats, contain four copies of EGF-like domains, and are assembled as pentamers. EGF, TSP3 repeats and the C-terminal domain are thus the hallmark of a thrombospondin.

    This repeat was first described in 1986 by Lawler and Hynes. It was found in the thrombospondin protein, where it is repeated 3 times. A number of proteins involved in the complement pathway (properdin, C6, C7, C8A, C8B, C9), as well as extracellular matrix proteins such as mindin, F-spondin and SCO-spondin, and even the circumsporozoite surface protein 2 and TRAP proteins of Plasmodium, are now known to contain one or more instances of this repeat. It has been implicated in cell-cell interaction, inhibition of angiogenesis and apoptosis.

    The intron-exon organisation of the properdin gene confirms the hypothesis that the repeat might have evolved by a process involving exon shuffling. A study of properdin structure provides some information about the structure of the thrombospondin type I repeat.

    Proteins where this domain is known:
    MAL8P1.45    PF13_0201    PFA0200w    PFC0210c    PFC0640w    PFF0800w    PFL0870w   


    SM00212 - UBCc (Smart link)

    Interpro entry IPR000608 : Ubiquitin-conjugating enzyme, E2 (Interpro link)

    Interpro description:

    The post-translational attachment of ubiquitin to proteins (ubiquitinylation) alters the function, location or trafficking of a protein, or targets it to the 26S proteasome for degradation. Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. The E1 enzyme mediates an ATP-dependent transfer of a thioester-linked ubiquitin molecule to a cysteine residue on the E2 enzyme. The E2 enzyme then either transfers the ubiquitin moiety directly to a substrate, or to an E3 ligase, which can also ubiquitinylate a substrate.

    There are several different E2 enzymes (over 30 in humans), which are broadly grouped into four classes, all of which have a core catalytic domain (containing the active site cysteine), and some of which have short N- and C-terminal amino acid extensions: class I enzymes consist of just the catalytic core domain (UBC), class II possess a UBC and a C-terminal extension, class III possess a UBC and an N-terminal extension, and class IV possess a UBC and both N- and C-terminal extensions. These extensions appear to be important for some subfamily function, including E2 localisation and protein-protein interactions. In addition, there are proteins with an E2-like fold that are devoid of catalytic activity, but which appear to assist in poly-ubiquitin chain formation.
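
    The four-class scheme described above depends only on which terminal extensions flank the UBC core, so it can be expressed as a small lookup. This is a sketch under that reading of the text; the function name e2_class and its boolean arguments are illustrative, and deciding whether a given protein actually has an extension is left to the caller.

```python
def e2_class(has_n_extension, has_c_extension):
    """Return the E2 class (I to IV) implied by the extensions around the UBC core."""
    if has_n_extension and has_c_extension:
        return "IV"   # UBC core plus both N- and C-terminal extensions
    if has_n_extension:
        return "III"  # UBC core plus an N-terminal extension
    if has_c_extension:
        return "II"   # UBC core plus a C-terminal extension
    return "I"        # catalytic core domain only


for n_ext in (False, True):
    for c_ext in (False, True):
        print(n_ext, c_ext, e2_class(n_ext, c_ext))
```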

    Proteins where this domain is known:
    MAL13P1.227    PF08_0085    PF10_0330    PF13_0301    PF14_0128    PFC0255c    PFC0855w    PFE1350c    PFF0305c    PFI0740c    PFI1030c    PFL0190w    PFL2100w    PFL2175w   


    SM00213 - UBQ (Smart link)

    Interpro entry IPR000626 : Ubiquitin (Interpro link)

    Interpro description:

    Ubiquitinylation is an ATP-dependent process that involves the action of at least three enzymes: a ubiquitin-activating enzyme (E1), a ubiquitin-conjugating enzyme (E2), and a ubiquitin ligase (E3), which work sequentially in a cascade. There are many different E3 ligases, which are responsible for the type of ubiquitin chain formed, the specificity of the target protein, and the regulation of the ubiquitinylation process. Ubiquitinylation is an important regulatory tool that controls the concentration of key signalling proteins, such as those involved in cell cycle control, as well as removing misfolded, damaged or mutant proteins that could be harmful to the cell. Several ubiquitin-like molecules have been discovered, such as Ufm1, SUMO1, NEDD8, Rad23, Elongin B and Parkin, the latter being involved in Parkinson's disease.

    Ubiquitin is a protein of 76 amino acid residues, found in all eukaryotic cells, whose sequence is extremely well conserved from protozoa to vertebrates. Ubiquitin acts through its post-translational attachment (ubiquitinylation) to other proteins, where these modifications alter the function, location or trafficking of the protein, or target it for destruction by the 26S proteasome. The terminal glycine in the C-terminal 4-residue tail of ubiquitin can form an isopeptide bond with a lysine residue in the target protein, or with a lysine in another ubiquitin molecule to form a ubiquitin chain that attaches itself to a target protein. Ubiquitin has seven lysine residues, any one of which can be used to link ubiquitin molecules together, resulting in different structures that alter the target protein in different ways. It appears that Lys(11)-, Lys(29)- and Lys(48)-linked poly-ubiquitin chains target the protein to the proteasome for degradation, while mono-ubiquitinylated and Lys(6)- or Lys(63)-linked poly-ubiquitin chains signal reversible modifications in protein activity, location or trafficking. For example, Lys(63)-linked poly-ubiquitinylation is known to be involved in DNA damage tolerance, inflammatory response, protein trafficking and signal transduction through kinase activation. In addition, the length of the ubiquitin chain alters the fate of the target protein. Regulatory proteins such as transcription factors and histones are frequent targets of ubiquitinylation.
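
    The seven linkage sites and the fates attributed to them above can be checked against a ubiquitin sequence. The sketch below assumes the human ubiquitin sequence, which is supplied here for illustration and does not come from this listing; the fate table restates only what the paragraph above says, and the names used are illustrative.

```python
# Human ubiquitin, 76 residues (assumption: supplied for illustration only).
UBIQUITIN = (
    "MQIFVKTLTG" "KTITLEVEPS" "DTIENVKAKI" "QDKEGIPPDQ"
    "QRLIFAGKQL" "EDGRTLSDYN" "IQKESTLHLV" "LRLRGG"
)

# 1-based positions of the lysines that can link ubiquitin molecules together.
LYSINES = [pos for pos, aa in enumerate(UBIQUITIN, start=1) if aa == "K"]

# Fates as summarised in the description; linkages it does not
# mention (Lys27, Lys33) are deliberately left out of the table.
FATE_BY_LINKAGE = {
    6: "reversible modification",
    11: "proteasomal degradation",
    29: "proteasomal degradation",
    48: "proteasomal degradation",
    63: "reversible modification",
}


def chain_fate(lysine_position):
    """Return the fate the description associates with a given linkage."""
    if lysine_position not in LYSINES:
        raise ValueError(f"ubiquitin has no lysine at position {lysine_position}")
    return FATE_BY_LINKAGE.get(lysine_position, "not described in the text")


print(LYSINES)
print(chain_fate(48))
```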

    Proteins where this domain is known:
    MAL13P1.64    PF10_0114    PF11_0142    PF13_0084    PF13_0346    PF14_0027    PFE0285c    PFE0380c    PFE1355c    PFI1085w    PFL0585w    PFL1830w   


    SM00219 - TyrKc (Smart link)

    Interpro entry IPR001245 : Tyrosine protein kinase (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    Tyrosine phosphorylating activity was originally detected in two viral transforming proteins, but many retroviral transforming proteins and their cellular counterparts have since been shown to possess such activity. The growth factor receptors, which are activated by ligand binding, and the insulin-related peptide receptor, are also family members.

    Proteins where this domain is known:
    MAL13P1.185    MAL13P1.279    MAL7P1.100    MAL7P1.18    PF08_0044    PF11_0147    PF14_0346    PFB0815w    PFI1685w   


    SM00220 - S_TKc (Smart link)

    Interpro entry IPR002290 : Serine/threonine protein kinase (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    Eukaryotic protein kinases are enzymes that belong to a very extensive family of proteins which share a conserved catalytic core common with both serine/threonine and tyrosine protein kinases. There are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a lysine residue, which has been shown to be involved in ATP binding. In the central part of the catalytic domain there is a conserved aspartic acid residue which is important for the catalytic activity of the enzyme.

    Proteins where this domain is known:
    MAL13P1.185    MAL13P1.278    MAL13P1.279    MAL7P1.100    MAL7P1.175    MAL7P1.18    MAL8P1.203    PF07_0072    PF08_0044    PF10_0141    PF11_0096    PF11_0147    PF11_0239    PF11_0242    PF13_0085    PF13_0211    PF14_0227    PF14_0294    PF14_0346    PF14_0392    PF14_0431    PF14_0476    PF14_0516    PFB0815w    PFC0385c    PFC0420w    PFC0525c    PFD0865c    PFE1290w    PFF0260w    PFF0520w    PFI1685w    PFL1370w    PFL1885c    PFL2250c   
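
    Because each entry lists its proteins as whitespace-separated identifiers, the lists can be compared as sets. The sketch below copies the SM00219 (TyrKc) and SM00220 (S_TKc) lists from this document and checks how they overlap; the variable names are illustrative.

```python
# Protein lists copied from the SM00219 (TyrKc) and SM00220 (S_TKc) entries.
tyrkc = set("""
MAL13P1.185 MAL13P1.279 MAL7P1.100 MAL7P1.18 PF08_0044 PF11_0147
PF14_0346 PFB0815w PFI1685w
""".split())

s_tkc = set("""
MAL13P1.185 MAL13P1.278 MAL13P1.279 MAL7P1.100 MAL7P1.175 MAL7P1.18
MAL8P1.203 PF07_0072 PF08_0044 PF10_0141 PF11_0096 PF11_0147 PF11_0239
PF11_0242 PF13_0085 PF13_0211 PF14_0227 PF14_0294 PF14_0346 PF14_0392
PF14_0431 PF14_0476 PF14_0516 PFB0815w PFC0385c PFC0420w PFC0525c
PFD0865c PFE1290w PFF0260w PFF0520w PFI1685w PFL1370w PFL1885c PFL2250c
""".split())

shared = sorted(tyrkc & s_tkc)       # proteins hit by both SMART models
only_s_tkc = sorted(s_tkc - tyrkc)   # proteins hit by S_TKc alone

print(len(shared), "proteins carry both annotations")
print(len(only_s_tkc), "proteins carry S_TKc only")
print("TyrKc list is a subset of S_TKc list:", tyrkc <= s_tkc)
```

    In this listing every protein annotated with TyrKc also appears under S_TKc, which is consistent with the two models describing closely related catalytic domains.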


    SM00222 - Sec7 (Smart link)

    Interpro entry IPR000904 : SEC7-like (Interpro link)

    Interpro description:
    The SEC7 domain was named after the first protein found to contain such a region. It has been shown to be linked with guanine nucleotide exchange function. The 3D structure of the domain displays several alpha-helices. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian factors.

    Proteins where this domain is known:
    PF14_0407   


    SM00225 - BTB (Smart link)

    Interpro entry IPR000210 : BTB/POZ-like (Interpro link)

    Interpro description:
    The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N terminus of a fraction of zinc finger proteins and in proteins that contain the Kelch motif, such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.

    Proteins where this domain is known:
    PF13_0238    PFL1875w   


    SM00228 - PDZ (Smart link)

    Interpro entry IPR001478 : PDZ/DHR/GLGF (Interpro link)

    Interpro description:

    PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.

    PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. Peptide binding of the ligand takes place in an elongated surface groove as an anti-parallel beta-strand interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.

    Proteins where this domain is known:
    MAL8P1.98    PFC0330w   


    SM00230 - CysPc (Smart link)

    Interpro entry IPR001300 : Peptidase C2, calpain (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micro-molar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the milli-molar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.

    All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:

    1. A 19-amino acid NH2-terminal sequence;
    2. Active site domain IIa;
    3. Active site domain IIb.

      Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.

    4. Domain III;
    5. An 18-amino acid extended sequence linking domain III to domain IV;
    6. Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity. Ca2+-binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad.

    Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.

    Proteins where this domain is known:
    MAL13P1.310   


    SM00232 - JAB_MPN (Smart link)

    Interpro entry IPR000555 : (Interpro link)

    Interpro description:

    Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain. It has been shown that this domain occurs in prokaryotes.

    Mov34 proteins act as the regulatory subunit of the 26S proteasome, which is involved in the ATP-dependent degradation of ubiquitinated proteins. The function of this domain is unclear, but it is found in the N-terminus of the proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors.

    A number of the proteins associated with this family belong to MEROPS peptidase family M67 (clan M-). This includes the Poh1 peptidase of Saccharomyces cerevisiae (Baker's yeast) which is a component of the 19S proteasome regulatory particle.

    Proteins where this domain is known:
    MAL13P1.343    PFD0265w    PFI0630w    PFI0895c   


    SM00233 - PH (Smart link)

    Interpro entry IPR001849 : (Interpro link)

    Interpro description:

    The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that occurs in a wide range of proteins involved in intracellular signalling or as constituents of the cytoskeleton.

    The function of this domain is not clear; several putative functions have been suggested:

  • binding to the beta/gamma subunit of heterotrimeric G proteins,
  • binding to lipids, e.g. phosphatidylinositol-4,5-bisphosphate,
  • binding to phosphorylated Ser/Thr residues,
  • attachment to membranes by an unknown mechanism.

    It is possible that different PH domains have totally different ligand requirements.

    The 3D structure of several PH domains has been determined. All known cases have a common structure consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic helix. The loops connecting the beta-strands differ greatly in length, making the PH domain relatively difficult to detect. There are no totally invariant residues within the PH domain.

    Proteins reported to contain one or more PH domains belong to the following families:

    Proteins where this domain is known:
    MAL13P1.188    MAL13P1.256    MAL13P1.306    PF11_0242    PF11_0327    PF11_0424    PFB0257c    PFD0705c   


    SM00238 - BIR (Smart link)

    Interpro entry IPR001370 : Proteinase inhibitor I32, inhibitor of apoptosis (Interpro link)

    Interpro description:

    Peptide proteinase inhibitors can be found as single domain proteins or as single or multiple domains within proteins; these are referred to as either simple or compound inhibitors, respectively. In many cases they are synthesised as part of a larger precursor protein, either as a prepropeptide or as an N-terminal domain associated with an inactive peptidase or zymogen. This domain prevents access of the substrate to the active site. Removal of the N-terminal inhibitor domain either by interaction with a second peptidase or by autocatalytic cleavage activates the zymogen. Other inhibitors interact direct with proteinases using a simple noncovalent lock and key mechanism; while yet others use a conformational change-based trapping mechanism that depends on their structural and thermodynamic properties.

    The baculovirus inhibitor of apoptosis protein repeat (BIR) is a domain of tandem repeats separated by a variable length linker that seems to confer cell death-preventing activity. The BIR domains characterise the Inhibitor of Apoptosis (IAP) family of proteins (MEROPS proteinase inhibitor family I32, clan IV) that suppress apoptosis by interacting with and inhibiting the enzymatic activity of both initiator and effector caspases (MEROPS peptidase family C14). Several distinct mammalian IAPs, including XIAP, c-IAP1, c-IAP2, and ML-IAP, have been identified, and they all exhibit antiapoptotic activity in cell culture. The functional unit in each IAP protein is the baculoviral IAP repeat (BIR), which contains approximately 80 amino acids folded around a zinc atom. Most mammalian IAPs have more than one BIR domain, with the different BIR domains performing distinct functions. For example, in XIAP, the third BIR domain (BIR3) potently inhibits the catalytic activity of caspase-9, whereas the linker sequences immediately preceding the second BIR domain (BIR2) selectively target caspase-3 or -7.

    The first-recognised members of MEROPS inhibitor family I32 were viral proteins that inhibited the apoptosis of infected cells: Cp-IAP from Cydia pomonella granulosis virus (CpGV) and Op-IAP from Orgyia pseudotsugata multicapsid polyhedrosis virus (OpMNPV). The discovery of homologous proteins in mammals followed soon after with the recognition that mutations in the gene for neuronal apoptosis inhibitory protein (NAIP) underlie spinal muscular atrophy. The inhibitors in family I32 all possess one or more 80-residue domains known as BIR (baculovirus inhibitor repeat) domains and have accordingly been termed 'BIR-containing' or 'BIRC' proteins as well as IAP proteins.

    The mechanism of inhibition of caspases by the IAP proteins is complex, and reactive site residues cannot yet be identified with any confidence. Despite the conservation of the BIR or IAP (inhibitor of apoptosis) domains throughout the family it seems clear that other parts of the molecules also make essential contributions to inhibitory activity.

    Homologs of most components in the mammalian apoptotic pathway have been identified in fruit flies. The Drosophila Apaf-1, known as Dapaf-1, HAC-1 or Dark, shares significant sequence similarity with its mammalian counterpart, and is critically important for the activation of the Drosophila initiator caspase Dronc. Dronc, in turn, cleaves and activates the effector caspase DrICE. The Drosophila IAP, DIAP1, binds to and inactivates both DrICE and Dronc through its BIR1 and BIR2 domains. During apoptosis, the anti-death function of DIAP1 is countered by at least four pro-apoptotic proteins, Reaper, Hid, Grim, and Sickle, through direct physical interactions. These four proteins represent the functional homologs of the mammalian protein Smac, and they all share a conserved IAP-binding motif at their N termini. The three proteins Reaper, Hid, and Grim are collectively referred to as the RHG proteins.

    Both XIAP and DIAP1 contain a RING domain at their C termini, and can act as an E3 ubiquitin ligase. Indeed, both XIAP and DIAP1 have been shown to promote self-ubiquitination and degradation as well as to negatively regulate the target caspases. Nonetheless, important differences exist between XIAP and DIAP1. The primary function of XIAP is thought to be the inhibition of the catalytic activities of caspases; to what extent the ubiquitinating activity of XIAP contributes to its function remains unclear. For DIAP1, however, the ubiquitinating activity appears to be essential for its function.

    Recently a Drosophila p53 protein has been identified that mediates apoptosis via a novel pathway involving the activation of the Reaper gene and subsequent inhibition of the inhibitors of apoptosis (IAPs). CIAP1, a major mammalian homolog of Drosophila IAPs, is irreversibly inhibited (cleaved) during p53-dependent apoptosis and this cleavage is mediated by a serine protease. Serine protease inhibitors that block CIAP1 cleavage inhibit p53-dependent apoptosis. Furthermore, activation of the p53 protein increases the transcription of the HTRA2 gene, which encodes a serine protease that interacts with CIAP1 and potentiates apoptosis. Therefore mammalian p53 protein activates apoptosis through a novel pathway functionally similar to that in Drosophila, which involves HTRA2 and subsequent inhibition of CIAP1 by cleavage.

    Proteins where this domain is known:
    PFE0985w   


    SM00239 - C2 (Smart link)

    Interpro entry IPR000008 : (Interpro link)

    Interpro description:
    The C2 domain is a Ca2+-dependent membrane-targeting module found in many cellular proteins involved in signal transduction or membrane trafficking. C2 domains are unique among membrane targeting domains in that they show a wide range of lipid selectivity for the major components of cell membranes, including phosphatidylserine and phosphatidylcholine. This C2 domain is about 116 amino-acid residues long and, in Protein Kinase C, is located between the two copies of the C1 domain (which bind phorbol esters and diacylglycerol) and the protein kinase catalytic domain. Regions with significant homology to the C2 domain have been found in many proteins. The C2 domain is thought to be involved in calcium-dependent phospholipid binding and in membrane targeting processes such as subcellular localisation.

    The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight-stranded beta sandwich constructed around a conserved 4-stranded motif, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key motif. Structural analyses of several C2 domains have shown them to consist of similar tertiary structures in which three Ca2+-binding loops are located at the end of an 8-stranded antiparallel beta sandwich.

    Proteins where this domain is known:
    MAL8P1.134    PF11_0107    PF14_0530    PFF0185c    PFL2110c   


    SM00240 - FHA (Smart link)

    Interpro entry IPR000253 : (Interpro link)

    Interpro description:

    The forkhead-associated (FHA) domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.

    To date, genes encoding FHA-containing proteins have been identified in eubacterial and eukaryotic but not archaeal genomes. The domain is present in a diverse range of proteins, such as kinases, phosphatases, kinesins, transcription factors, RNA-binding proteins and metabolic enzymes which partake in many different cellular processes - DNA repair, signal transduction, vesicular transport and protein degradation are just a few examples.

    Proteins where this domain is known:
    MAL13P1.405    PF11_0347    PF13_0042    PFI0470w    PFL0275w   


    SM00242 - MYSc (Smart link)

    Interpro entry IPR001609 : Myosin head, motor region (Interpro link)

    Interpro description:

    Muscle contraction is caused by sliding between the thick and thin filaments of the myofibril. Myosin is a major component of thick filaments and exists as a hexamer of 2 heavy chains, 2 alkali light chains, and 2 regulatory light chains. The heavy chain can be subdivided into the N-terminal globular head and the C-terminal coiled-coil rod-like tail, although some forms have a globular region at their C terminus. There are many cell-specific isoforms of myosin heavy chains, coded for by a multi-gene family. Myosin interacts with actin to convert chemical energy, in the form of ATP, to mechanical energy. The 3-D structure of the head portion of myosin has been determined and a model for the actin-myosin complex has been constructed.

    The globular head is well conserved, with some highly conserved regions possibly relating to functional and structural domains. The rod-like tail starts with an invariant proline residue, and contains many repeats of a 28-residue region, interrupted at 4 regularly spaced points known as skip residues. Although the sequence of the tail is not well conserved, its chemical character is: hydrophobic, charged and skip residues occur in a highly ordered and repeated fashion.

    Proteins where this domain is known:
    MAL13P1.148    PF11_0416    PF13_0233    PFE0175c    PFF0675c    PFL1435c   


    SM00244 - PHB (Smart link)

    Interpro entry IPR001107 : (Interpro link)

    Interpro description:
    The band 7 protein is an integral membrane protein which is thought to regulate cation conductance. A variety of proteins belong to this family. These include the prohibitins, cytoplasmic anti-proliferative proteins and stomatin, an erythrocyte membrane protein. Bacterial HflC protein also belongs to this family.

    Proteins where this domain is known:
    PF08_0006    PF10_0144    PFC0800w   


    SM00248 - ANK (Smart link)

    Interpro entry IPR002110 : (Interpro link)

    Interpro description:

    The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins, mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.

    The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.

    Proteins where this domain is known:
    MAL13P1.126    MAL13P1.71    MAL8P1.28    PF10_0102    PF10_0213    PF10_0328    PF11_0197    PF11_0439    PF14_0106    PF14_0222    PFC0160w    PFE0400w    PFF1315w    PFF1365c    PFL2200w   


    SM00249 - PHD (Smart link)

    Interpro entry IPR001965 : Zinc finger, PHD-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the PHD (plant homeodomain) zinc finger domain, which is a C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from, the C3HC4-type RING finger.

    The function of this domain is not yet known but in analogy with the LIM domain it could be involved in protein-protein interaction and be important for the assembly or activity of multicomponent complexes involved in transcriptional activation or repression. Alternatively, the interactions could be intra-molecular and be important in maintaining the structural integrity of the protein. In similarity to the RING finger and the LIM domain, the PHD finger is thought to bind two zinc ions.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.122    MAL13P1.302    PF10_0079    PF11_0429    PF14_0315    PFC0425w    PFF1185w    PFF1440w    PFL0575w    PFL1010c   


    SM00259 - ZnF_A20 (Smart link)

    Interpro entry IPR002653 : Zinc finger, A20-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the zinc finger domain found in A20. A20 is an inhibitor of cell death that inhibits NF-kappaB activation via the tumour necrosis factor receptor associated factor pathway. The zinc finger domains appear to mediate self-association in A20. These fingers also mediate IL-1-induced NF-kappaB activation.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0056   


    SM00268 - ACTIN (Smart link)

    Interpro entry IPR004000 : Actin/actin-like (Interpro link)

    Interpro description:

    Actin is a ubiquitous protein involved in the formation of filaments that are major components of the cytoskeleton. These filaments interact with myosin to produce a sliding effect, which is the basis of muscular contraction and many aspects of cell motility, including cytokinesis. Each actin protomer binds one molecule of ATP and has one high affinity site for either calcium or magnesium ions, as well as several low affinity sites. Actin exists as a monomer in low salt concentrations, but filaments form rapidly as salt concentration rises, with the consequent hydrolysis of ATP. Actin from many sources forms a tight complex with deoxyribonuclease (DNase I) although the significance of this is still unknown. The formation of this complex results in the inhibition of DNase I activity, and actin loses its ability to polymerise. It has been shown that an ATPase domain of actin shares similarity with ATPase domains of hexokinase and hsp70 proteins.

    In vertebrates there are three groups of actin isoforms: alpha, beta and gamma. The alpha actins are found in muscle tissues and are a major constituent of the contractile apparatus. The beta and gamma actins co-exist in most cell types as components of the cytoskeleton and as mediators of internal cell motility. In plants there are many isoforms which are probably involved in a variety of functions such as cytoplasmic streaming, cell shape determination, tip growth, graviperception, cell wall deposition, etc.

    Recently some divergent actin-like proteins have been identified in several species. These proteins include centractin (actin-RPV) from mammals, fungi (yeast ACT5, Neurospora crassa ro-4) and Pneumocystis carinii, which seems to be a component of a multi-subunit centrosomal complex involved in microtubule-based vesicle motility (this subfamily is known as ARP1); the ARP2 subfamily, which includes chicken ACTL, Saccharomyces cerevisiae ACT2, Drosophila melanogaster 14D and Caenorhabditis elegans actC; the ARP3 subfamily, which includes actin 2 from mammals, Drosophila 66B, yeast ACT4 and Schizosaccharomyces pombe act2; and the ARP4 subfamily, which includes yeast ACT3 and Drosophila 13E.

    Proteins where this domain is known:
    PF07_0077    PF11_0047    PF11_0114    PF14_0124    PF14_0218    PFA0190c    PFD0487c    PFE0255w    PFL2215w   


    SM00271 - DnaJ (Smart link)

    Interpro entry IPR001623 : Heat shock protein DnaJ, N-terminal (Interpro link)

    Interpro description:

    The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.

    It is thought that the 'J' domain of DnaJ mediates the interaction with the DnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.
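
    The CXXCXGXG repeat of the 'CRR' domain described above can be located with a simple pattern scan. The following is only a minimal sketch: the function name and the toy sequence are hypothetical, X is assumed to mean any single residue, and this is not how SMART or Interpro assign the domain (those rely on profile-based models).

```python
import re

# CXXCXGXG written as a regular expression, where "." stands for any residue (X).
# The lookahead wrapper lets overlapping occurrences be reported as well.
CRR_MOTIF = re.compile(r"(?=(C..C.G.G))")


def find_crr_repeats(sequence):
    """Return (0-based start, matched text) for each CXXCXGXG occurrence."""
    return [(m.start(), m.group(1)) for m in CRR_MOTIF.finditer(sequence.upper())]


# Made-up toy sequence for illustration only, not a real protein.
toy = "MAKCPTCHGSGAAACEKCNGTGKK"
print(find_crr_repeats(toy))
```

    A hit from such a scan is only a hint that a CRR-like repeat may be present; short motifs of this kind can also match by chance.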

    Proteins where this domain is known:
    MAL13P1.162    MAL13P1.277    MAL8P1.204    PF07_0103    PF08_0032    PF08_0115    PF10_0032    PF10_0058    PF10_0378    PF10_0381    PF11_0034    PF11_0099    PF11_0273    PF11_0380    PF11_0433    PF11_0443    PF11_0509    PF11_0512    PF11_0513    PF13_0036    PF13_0102    PF14_0013    PF14_0137    PF14_0213    PF14_0359    PF14_0700    PFA0110w    PFA0660w    PFA0675w    PFB0085c    PFB0090c    PFB0595w    PFB0920w    PFB0925w    PFD0462w    PFE0040c    PFE0055c    PFE0135w    PFE1170w    PFF1010c    PFF1415c    PFI0855w    PFI0935w    PFL0055c    PFL0565w    PFL0815w    PFL2550w   


    SM00278 - HhH1 (Smart link)

    Interpro entry IPR003583 : Helix-hairpin-helix DNA-binding motif, class 1 (Interpro link)

    Interpro description:
    The HhH motif is a domain of around 20 amino acids present in prokaryotic and eukaryotic non-sequence-specific DNA-binding proteins. The HhH motif is similar to, but distinct from, the HtH motif. Both of these motifs have two helices connected by a short turn. In the HtH motif the second helix binds to DNA with the helix in the major groove. This allows contact between specific bases and residues throughout the protein. In the HhH motif the second helix does not protrude from the surface of the protein and therefore cannot lie in the major groove of the DNA. Crystallographic studies suggest that the interaction of the HhH domain with DNA is mediated by amino acids located in the strongly conserved loop (L-P-G-V) and at the N-terminal end of the second helix. This interaction could involve the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. The structural difference between the HtH and HhH domains is reflected at the functional level: whereas the HtH domain, found primarily in gene regulatory proteins, binds DNA in a sequence-specific manner, the HhH domain is instead found in proteins involved in enzymatic activities and binds DNA with no sequence specificity.

    Proteins where this domain is known:
    PF11_0087    PFF0715c   


    SM00279 - HhH2 (Smart link)

    Interpro entry IPR008918 : Helix-hairpin-helix motif, class 2 (Interpro link)

    Interpro description:

    The helix-hairpin-helix (HhH) motif is a domain of around 20 amino acids present in prokaryotic and eukaryotic non-sequence-specific DNA-binding proteins. The HhH motif is similar to, but distinct from, the helix-turn-helix (HtH) and the helix-loop-helix (HLH) motifs. All three motifs have two helices (H1 and H2) connected by a short turn. DNA-binding proteins with a HhH structural motif are involved in non-sequence-specific DNA binding that occurs via the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. These HhH motifs are observed in DNA repair enzymes and in DNA polymerases. By contrast, proteins with a HtH motif bind DNA in a sequence-specific manner through the binding of H2 with the major groove; these proteins are primarily gene regulatory proteins. DNA-binding proteins with the HLH structural motif are transcriptional regulatory proteins and are principally related to a wide array of developmental processes.

    Examples of proteins that contain a HhH motif include the 5'-exonuclease domains of prokaryotic DNA polymerases, the eukaryotic/prokaryotic RAD2 family of 5'-3' exonucleases such as T4 RNase H and T5, eukaryotic 5' endonucleases such as FEN-1 (Flap), and some viral exonucleases.

    Proteins where this domain is known:
    PF10_0080    PFB0180w    PFB0265c    PFD0420c   


    SM00290 - ZnF_UBP (Smart link)

    Interpro entry IPR001607 : Zinc finger, UBP-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents UBP-type zinc finger domains, which display some similarity with the Zn-binding domain of the insulinase family. The UBP-type zinc finger domain is found only in a small subfamily of ubiquitin C-terminal hydrolases (deubiquitinases or UBP). All members of this subfamily are isopeptidase-T, which are known to cleave isopeptide bonds between ubiquitin moieties.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL7P1.120    PFD0655c   


    SM00291 - ZnF_ZZ (Smart link)

    Interpro entry IPR000433 : Zinc finger, ZZ-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents ZZ-type zinc finger domains, named because of their ability to bind two zinc ions. These domains contain 4-6 Cys residues that participate in zinc binding (plus additional Ser/His residues), including a Cys-X2-Cys motif found in other zinc finger domains. These zinc fingers are thought to be involved in protein-protein interactions. The structure of the ZZ domain shows that it belongs to the family of cross-brace zinc finger motifs that include the PHD, RING, and FYVE domains. ZZ-type zinc finger domains are found in:

    Single copies of the ZZ zinc finger occur in the transcriptional adaptor/coactivator proteins P300, in cAMP response element-binding protein (CREB)-binding protein (CBP) and ADA2. CBP provides several binding sites for transcriptional coactivators. The site of interaction with the tumour suppressor protein p53 and the oncoprotein E1A with CBP/P300 is a Cys-rich region that incorporates two zinc-binding motifs: ZZ-type and TAZ2-type. The ZZ-type zinc finger of CBP contains two twisted anti-parallel beta-sheets and a short alpha-helix, and binds two zinc ions. One zinc ion is coordinated by four cysteine residues via 2 Cys-X2-Cys motifs, and the second zinc ion via a third Cys-X-Cys motif and a His-X-His motif. The first zinc cluster is strictly conserved, whereas the second zinc cluster displays variability in the position of the two His residues.

    In Arabidopsis thaliana (Mouse-ear cress), the hypersensitive to red and blue 1 (Hrb1) protein, which regulates both red and blue light responses, contains a ZZ-type zinc finger domain.

    ZZ-type zinc finger domains have also been identified in the testis-specific E3 ubiquitin ligase MEX that promotes death receptor-induced apoptosis. MEX has four putative zinc finger domains: one ZZ-type, one SWIM-type and two RING-type. The region containing the ZZ-type and RING-type zinc fingers is required for interaction with UbcH5a and MEX self-association, whereas the SWIM domain was critical for MEX ubiquitination.

    In addition, the Cys-rich domains of dystrophin, utrophin and an 87kDa post-synaptic protein contain a ZZ-type zinc finger with high sequence identity to P300/CBP ZZ-type zinc fingers. In dystrophin and utrophin, the ZZ-type zinc finger lies between a WW domain (flanked by an EF hand) and the C-terminal coiled-coil domain. Dystrophin is thought to act as a link between the actin cytoskeleton and the extracellular matrix, and perturbations of the dystrophin-associated complex, for example, between dystrophin and the transmembrane glycoprotein beta-dystroglycan, may lead to muscular dystrophy. Dystrophin and its autosomal homologue utrophin interact with beta-dystroglycan via their C-terminal regions, which are comprised of a WW domain, an EF hand domain and a ZZ-type zinc finger domain. The WW domain is the primary site of interaction between dystrophin or utrophin and dystroglycan, while the EF hand and ZZ-type zinc finger domains stabilise and strengthen this interaction.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF10_0143   


    SM00292 - BRCT (Smart link)

    Interpro entry IPR001357 : BRCT (Interpro link)

    Interpro description:

    The BRCT domain (after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example as found in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.

    A chitin biosynthesis protein from yeast also seems to belong to this group.

    Proteins where this domain is known:
    PF11_0090    PFB0895c    PFI0510c   


    SM00297 - BROMO (Smart link)

    Interpro entry IPR001487 : (Interpro link)

    Interpro description:
    Bromodomains are found in a variety of mammalian, invertebrate and yeast DNA-binding proteins. Bromodomains can interact with acetylated lysine. In some proteins, the classical bromodomain has diverged to such an extent that parts of the region are either missing or contain an insertion (e.g., mammalian protein HRX, Caenorhabditis elegans hypothetical protein ZK783.4, yeast protein YTA7). The bromodomain may occur as a single copy, or in duplicate.

    The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.

    Proteins where this domain is known:
    PF08_0034    PF10_0328    PF14_0724    PFA0510w    PFF1440w    PFL0635c    PFL1645w   


    SM00298 - CHROMO (Smart link)

    Interpro entry IPR000953 : Chromo domain (Interpro link)

    Interpro description:
    The CHROMO (CHRomatin Organization MOdifier) domain is a conserved region of around 60 amino acids, originally identified in Drosophila modifiers of variegation. These are proteins that alter the structure of chromatin to the condensed morphology of heterochromatin, a cytologically visible condition where gene expression is repressed. In one of these proteins, Polycomb, the chromo domain has been shown to be important for chromatin targeting. Proteins that contain a chromo domain appear to fall into 3 classes. The first class includes proteins having an N-terminal chromo domain followed by a region termed the chromo shadow domain, e.g. Drosophila and human heterochromatin protein Su(var)205 (HP1). The second class includes proteins with a single chromo domain, e.g. Drosophila protein Polycomb (Pc); mammalian modifier 3; human Mi-2 autoantigen and several yeast and Caenorhabditis elegans hypothetical proteins. In the third class paired tandem chromo domains are found, e.g. in mammalian DNA-binding/helicase proteins CHD-1 to CHD-4 and yeast protein CHD1.

    Proteins where this domain is known:
    PF10_0232    PF11_0192    PF11_0418    PFL1005c   


    SM00299 - CLH (Smart link)

    Interpro entry IPR000547 : Clathrin, heavy chain/VPS, 7-fold repeat (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.

    This entry represents the 7-fold alpha-alpha-superhelical ARM-type repeat found at the C terminus of clathrin heavy chains and in VPS (vacuolar protein sorting-associated) proteins. In clathrin heavy chains, the C-terminal 7-fold ARM-type repeats interact to form the central hub of the triskelion. VPS proteins are required for vacuolar assembly and vacuolar trafficking, and contain one clathrin-type repeat.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PFL0930w   


    SM00302 - GED (Smart link)

    Interpro entry IPR003130 : Dynamin GTPase effector (Interpro link)

    Interpro description:

    Dynamin GTPase effector domain found in proteins related to dynamin.

    Dynamin is a GTP-hydrolysing protein that is an essential participant in clathrin-mediated endocytosis by cells. It self-assembles into 'collars' in vivo at the necks of invaginated coated pits; the self-assembly of dynamin being coordinated by the GTPase domain. Mutation studies indicate that dynamin functions as a molecular regulator of receptor-mediated endocytosis.

    Proteins where this domain is known:
    PF10_0368    PF11_0465   


    SM00311 - PWI (Smart link)

    Interpro entry IPR002483 : Splicing factor PWI (Interpro link)

    Interpro description:

    The PWI domain, named after a highly conserved PWI tri-peptide located within its N-terminal region, is a ~80 amino acid module, which is found either at the N terminus or at the C terminus of eukaryotic proteins involved in pre-mRNA processing. It is generally found in association with other domains such as RRM and RS. The PWI domain is an RNA/DNA-binding domain that has an equal preference for single- and double-stranded nucleic acids and is likely to have multiple important functions in pre-mRNA processing. Proteins containing this domain include the SR-related nuclear matrix protein of 160 kD (SRm160) splicing and 3'-end cleavage-stimulatory factor, and the mammalian splicing factor PRP3.

    The PWI domain is a soluble, globular and independently folded domain which consists of a four-helix bundle, with structured N- and C-terminal elements.

    Proteins where this domain is known:
    PFC0465c   


    SM00312 - PX (Smart link)

    Interpro entry IPR001683 : Phox-like (Interpro link)

    Interpro description:

    The PX (phox) domain occurs in a variety of eukaryotic proteins and has been implicated in highly diverse functions such as cell signalling, vesicular trafficking, protein sorting and lipid modification. PX domains are important phosphoinositide-binding modules that have varying lipid-binding specificities. The PX domain is approximately 120 residues long, and folds into a three-stranded beta-sheet followed by three alpha-helices and a proline-rich region that immediately precedes a membrane-interaction loop and spans approximately eight hydrophobic and polar residues. The PX domain of p47phox binds to the SH3 domain in the same protein. Phosphorylation of p47(phox), a cytoplasmic activator of the microbicidal phagocyte oxidase (phox), elicits interaction of p47(phox) with phosphoinositides. The protein phosphorylation-driven conformational change of p47(phox) enables its PX domain to bind to phosphoinositides, the interaction of which plays a crucial role in recruitment of p47(phox) from the cytoplasm to membranes and subsequent activation of the phagocyte oxidase. The lipid-binding activity of this protein is normally suppressed by intramolecular interaction of the PX domain with the C-terminal Src homology 3 (SH3) domain.

    The PX domain is conserved from yeast to human. A recent multiple alignment of representative PX domain sequences can be found in, although showing relatively little sequence conservation, their structure appears to be highly conserved. Although phosphatidylinositol-3-phosphate (PtdIns(3)P) is the primary target of PX domains, binding to phosphatidic acid, phosphatidylinositol-3,4-bisphosphate (PtdIns(3,4)P2), phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2), phosphatidylinositol-4,5-bisphosphate (PtdIns(4,5)P2), and phosphatidylinositol-3,4,5-trisphosphate (PtdIns(3,4,5)P3) has been reported as well. The PX-domain is also a protein-protein interaction domain.

    Proteins where this domain is known:
    PF07_0017   


    SM00316 - S1 (Smart link)

    Interpro entry IPR003029 : S1, RNA binding (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The S1 domain was originally identified in ribosomal protein S1 but is found in a large number of RNA-associated proteins. The structure of the S1 RNA-binding domain from the Escherichia coli polynucleotide phosphorylase has been determined using NMR methods and consists of a five-stranded antiparallel beta barrel. Conserved residues on one face of the barrel and adjacent loops form the putative RNA-binding site.

    The structure of the S1 domain is very similar to that of cold shock proteins. This suggests that they may both be derived from an ancient nucleic acid-binding protein.

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    MAL13P1.36    MAL7P1.104    MAL8P1.101    MAL8P1.18    PF07_0117    PF10_0294    PF14_0658    PFB0215c    PFD0515w    PFE0830c   


    SM00317 - SET (Smart link)

    Interpro entry IPR001214 : (Interpro link)

    Interpro description:

    The SET domain generally appears as one part of a larger multidomain protein. Three structures of very different proteins with distinct domain compositions have been described: Neurospora crassa DIM-5, a member of the Su(var) family of HKMTs which methylate histone H3 on lysine 9; human SET7 (also called SET9), which methylates H3 on lysine 4; and garden pea Rubisco LSMT, an enzyme that does not modify histones, but instead methylates lysine 14 in the flexible tail of the large subunit of the enzyme Rubisco. The SET domain itself turned out to be an uncommon structure. Although in all three studies electron density maps revealed the location of the AdoMet or AdoHcy cofactor, the SET domain bears no similarity at all to the canonical AdoMet-dependent methyltransferase fold. A tyrosine that is strictly conserved in the C-terminal motif of the SET domain could be involved in abstracting a proton from the protonated amino group of the substrate lysine, promoting its nucleophilic attack on the sulphonium methyl group of the AdoMet cofactor. In contrast to the AdoMet-dependent protein methyltransferases of the classical type, which tend to bind their polypeptide substrates on top of the cofactor, it is noted from the Rubisco LSMT structure that the AdoMet seems to bind in a separate cleft, suggesting how a polypeptide substrate could be subjected to multiple rounds of methylation without having to be released from the enzyme. In contrast, SET7/9 is able to add only a single methyl group to its substrate. It has been demonstrated that association of SET domain and myotubularin-related proteins modulates growth control. The SET domain-containing Drosophila melanogaster (Fruit fly) protein, enhancer of zeste, has a function in segment determination and the mammalian homologue may be involved in the regulation of gene transcription and chromatin structure.

    Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines as well as in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone-target and the roles of the invariant residues in the SET domain in determining the methylation specificities.

    The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These nine cysteines coordinate three zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by four cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together two long segments of random coils.

    The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.

    Proteins where this domain is known:
    MAL13P1.122    PF08_0012    PF11_0160    PF13_0293    PFD0190w    PFF1440w    PFI0485c    PFL0690c   


    SM00318 - SNc (Smart link)

    Interpro entry IPR006021 : Staphylococcal nuclease (SNase-like) (Interpro link)

    Interpro description:

    Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multidomain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.

    Proteins where this domain is known:
    PF11_0374   


    SM00320 - WD40 (Smart link)

    Interpro entry IPR001680 : (Interpro link)

    Interpro description:

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    Proteins where this domain is known:
    MAL13P1.142    MAL13P1.148    MAL13P1.245    MAL13P1.264    MAL13P1.385    MAL13P1.54    MAL13P1.79    MAL7P1.81    MAL8P1.139    MAL8P1.145    MAL8P1.43    PF07_0017    PF07_0092    PF07_0106    PF08_0019    PF08_0065    PF08_0130    PF08_0135    PF10_0044    PF10_0045    PF10_0126    PF10_0128    PF10_0196    PF10_0261    PF10_0285    PF10_0326    PF11_0056    PF11_0171    PF11_0195    PF11_0222    PF11_0252    PF11_0400    PF11_0471    PF13_0149    PF13_0184    PF13_0250    PF13_0309    PF13_0335    PF14_0055    PF14_0062    PF14_0087    PF14_0101    PF14_0243    PF14_0263    PF14_0314    PF14_0412    PF14_0456    PF14_0565    PF14_0640    PFA0520c    PFB0640c    PFC0100c    PFC0365w    PFC0965w    PFD0455w    PFE0090w    PFE0505w    PFE0540w    PFE0930w    PFE1270c    PFE1310c    PFF0330w    PFF0395c    PFF1000w    PFF1480w    PFI0275w    PFI0290c    PFI1080w    PFL0470w    PFL0610w    PFL0970w    PFL1040w    PFL1290w    PFL1395c    PFL1470c    PFL1480w    PFL1820w    PFL1975c    PFL2105c    PFL2460w   


    SM00322 - KH (Smart link)

    Interpro entry IPR004087 : K Homology (Interpro link)

    Interpro description:

    The K homology (KH) domain was first identified in the human heterogeneous nuclear ribonucleoprotein (hnRNP) K. An evolutionarily conserved sequence of around 70 amino acids, the KH domain is present in a wide variety of nucleic acid-binding proteins. The KH domain binds RNA, and can function in RNA recognition. It is found in multiple copies in several proteins, where they can function cooperatively or independently. For example, in the AU-rich element RNA-binding protein KSRP, which has 4 KH domains, KH domains 3 and 4 behave as independent binding modules to interact with different regions of the AU-rich RNA targets. The solution structure of the first KH domain of FMR1 and of the C-terminal KH domain of hnRNP K determined by nuclear magnetic resonance (NMR) revealed a beta-alpha-alpha-beta-beta-alpha structure. Proteins containing KH domains include:

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    PF10_0115    PF14_0151    PF14_0627    PF14_0661    PFB0370c    PFC0130c    PFE0500c    PFF0250w    PFF1135w   


    SM00324 - RhoGAP (Smart link)

    Interpro entry IPR000198 : RhoGAP (Interpro link)

    Interpro description:
    Members of the Rho family of small G proteins transduce signals from plasma-membrane receptors and control cell adhesion, motility and shape by actin cytoskeleton formation. Like all other GTPases, Rho proteins act as molecular switches, with an active GTP-bound form and an inactive GDP-bound form. The active conformation is promoted by guanine-nucleotide exchange factors, and the inactive state by GTPase-activating proteins (GAPs) which stimulate the intrinsic GTPase activity of small G proteins. This entry is a Rho/Rac/Cdc42-like GAP domain, which is found in a wide variety of large, multi-functional proteins. A number of structures are known for this family. The domain is composed of seven alpha helices. This domain is also known as the breakpoint cluster region-homology (BH) domain.

    Proteins where this domain is known:
    PF10_0071   


    SM00327 - VWA (Smart link)

    Interpro entry IPR002035 : (Interpro link)

    Interpro description:
    The von Willebrand factor is a large multimeric glycoprotein found in blood plasma. Mutant forms are involved in the aetiology of bleeding disorders. In von Willebrand factor, the type A domain (vWF) is the prototype for a protein superfamily. The vWF domain is found in various plasma proteins: complement factors B, C2, CR3 and CR4; the integrins (I-domains); collagen types VI, VII, XII and XIV; and other extracellular proteins. Although the majority of VWA-containing proteins are extracellular, the most ancient ones present in all eukaryotes are all intracellular proteins involved in functions such as transcription, DNA repair, ribosomal and membrane transport and the proteasome. A common feature appears to be involvement in multiprotein complexes. Proteins that incorporate vWF domains participate in numerous biological events (e.g. cell adhesion, migration, homing, pattern formation, and signal transduction), involving interaction with a large array of ligands. A number of human diseases arise from mutations in VWA domains. Secondary structure prediction from 75 aligned vWF sequences has revealed a largely alternating sequence of alpha-helices and beta-strands. Fold recognition algorithms were used to score sequence compatibility with a library of known structures: the vWF domain fold was predicted to be a doubly-wound, open, twisted beta-sheet flanked by alpha-helices. 3D structures have been determined for the I-domains of integrins CD11b (with bound magnesium) and CD11a (with bound manganese). The domain adopts a classic alpha/beta Rossmann fold and contains an unusual metal ion coordination site at its surface. It has been suggested that this site represents a general metal ion-dependent adhesion site (MIDAS) for binding protein ligands. The residues constituting the MIDAS motif in the CD11b and CD11a I-domains are completely conserved, but the manner in which the metal ion is coordinated differs slightly.

    Proteins where this domain is known:
    MAL13P1.76    PF08_0109    PF08_0136b    PF13_0201    PF14_0326    PFC0640w    PFF0800w   


    SM00331 - PP2C_SIG (Smart link)

    Interpro entry IPR001932 : Protein phosphatase 2C-related (Interpro link)

    Interpro description:

    This domain is found in protein phosphatase 2C, as well as in other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase.

    Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine-specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa which shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. Its exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and which contains a C-terminal PP2C domain.

    PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.

    Proteins where this domain is known:
    MAL13P1.44    MAL8P1.109    PF11_0362    PFD0505c    PFL0445w    PFL2365w   


    SM00332 - PP2Cc (Smart link)

    Interpro entry IPR001932 : Protein phosphatase 2C-related (Interpro link)

    Interpro description:

    This domain is found in protein phosphatase 2C, as well as in other proteins, e.g. [pyruvate dehydrogenase (lipoamide)]-phosphatase and adenylate cyclase.

    Protein phosphatase 2C (PP2C) is one of the four major classes of mammalian serine/threonine-specific protein phosphatases. PP2C is a monomeric enzyme of about 42 kDa which shows broad substrate specificity and is dependent on divalent cations (mainly manganese and magnesium) for its activity. Its exact physiological role is still unclear. Three isozymes are currently known in mammals: PP2C-alpha, -beta and -gamma. In yeast, there are at least four PP2C homologs: phosphatase PTC1, which has weak tyrosine phosphatase activity in addition to its activity on serines, phosphatases PTC2 and PTC3, and hypothetical protein YBR125c. Isozymes of PP2C are also known from Arabidopsis thaliana (ABI1, PPH1), Caenorhabditis elegans (FEM-2, F42G9.1, T23F11.1), Leishmania chagasi and Paramecium tetraurelia. In A. thaliana, the kinase associated protein phosphatase (KAPP) is an enzyme that dephosphorylates the Ser/Thr receptor-like kinase RLK5 and which contains a C-terminal PP2C domain.

    PP2C does not seem to be evolutionarily related to the main family of serine/threonine phosphatases: PP1, PP2A and PP2B. However, it is significantly similar to the catalytic subunit of pyruvate dehydrogenase phosphatase (PDPC), which catalyzes dephosphorylation and concomitant reactivation of the alpha subunit of the E1 component of the pyruvate dehydrogenase complex. PDPC is a mitochondrial enzyme and, like PP2C, is magnesium-dependent.

    Proteins where this domain is known:
    MAL13P1.44    MAL8P1.108    MAL8P1.109    PF07_0019    PF11_0362    PF11_0396    PF14_0523    PFD0505c    PFE1010w    PFF0770c    PFL0445w    PFL2365w   


    SM00333 - TUDOR (Smart link)

    Interpro entry IPR002999 : Tudor (Interpro link)

    Interpro description:

    The Drosophila tudor protein is encoded by a 'posterior group' gene, which when mutated disrupts normal abdominal segmentation and pole cell formation. Another Drosophila gene, homeless, is required for RNA localization during oogenesis. The tudor protein contains multiple repeats of a domain which is also found in homeless.

    The tudor domain is found in many proteins that colocalise with ribonucleoprotein or single-strand DNA-associated complexes in the nucleus, in the mitochondrial membrane, or at kinetochores. It is not known whether the domain binds directly to RNA and ssDNA, or controls interactions with the nucleoprotein complexes. At least one tudor-containing protein, homeless, also contains a zinc finger typical of RNA-binding proteins.

    The resolution of the solution structure of the Tudor domain of human SMN revealed that the Tudor domain forms a strongly bent antiparallel beta-sheet with five strands forming a barrel-like fold. The structure exhibits a conserved negatively charged surface that interacts with the C-terminal Arg and Gly-rich tails of the spliceosomal Sm D1 and D3 proteins.

    Proteins where this domain is known:
    PF11_0374    PFC1050w   


    SM00336 - BBOX (Smart link)

    Interpro entry IPR000315 : Zinc finger, B-box (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents B-box-type zinc finger domains, which are around 40 residues in length. B-box zinc fingers can be divided into two groups, where types 1 and 2 B-box domains differ in their consensus sequence and in the spacing of the 7-8 zinc-binding residues. Several proteins contain both types 1 and 2 B-boxes, suggesting some level of cooperativity between these two domains. B-box domains are found in over 1500 proteins from a variety of organisms. They are found in TRIM (tripartite motif) proteins that consist of an N-terminal RING finger (originally called an A-box), followed by 1-2 B-box domains and a coiled-coil domain (also called RBCC for Ring, B-box, Coiled-Coil). TRIM proteins contain a type 2 B-box domain, and may also contain a type 1 B-box. In proteins that do not contain RING or coiled-coil domains, the B-box domain is primarily type 2. Many type 2 B-box proteins are involved in ubiquitinylation. Proteins containing a B-box zinc finger domain include transcription factors, ribonucleoproteins and proto-oncoproteins; for example, MID1, MID2, TRIM9, TNL, TRIM36, TRIM63, TRIFIC, NCL1 and CONSTANS-like proteins.

    The microtubule-associated E3 ligase MID1 contains a type 1 B-box zinc finger domain. MID1 specifically binds Alpha-4, which in turn recruits the catalytic subunit of phosphatase 2A (PP2Ac). This complex is required for targeting of PP2Ac for proteasome-mediated degradation. The MID1 B-box coordinates two zinc ions and adopts a beta/beta/alpha cross-brace structure similar to that of ZZ, PHD, RING and FYVE zinc fingers.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFE0895c   


    SM00343 - ZnF_C2HC (Smart link)

    Interpro entry IPR001878 : Zinc finger, CCHC-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the CysCysHisCys (CCHC) type zinc finger domains, which have the sequence:

    where X can be any amino acid, and number indicates the number of residues. These 18-residue CCHC zinc finger domains are mainly found in the nucleocapsid protein of retroviruses, where they are required for viral genome packaging and for the early infection process. They are also found in eukaryotic proteins involved in RNA binding or single-stranded DNA binding.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF14_0139    PFE0425w    PFE1390w    PFF0500c    PFF1135w    PFI0480w    PFL1990c   


    SM00350 - MCM (Smart link)

    Interpro entry IPR001208 : DNA-dependent ATPase MCM (Interpro link)

    Interpro description:

    MCM proteins are DNA-dependent ATPases required for the initiation of eukaryotic DNA replication. In eukaryotes there is a family of six proteins, MCM2 to MCM7. They were first identified in yeast where most of them have a direct role in the initiation of chromosomal DNA replication by interacting directly with autonomously replicating sequences (ARS). They were thus called minichromosome maintenance proteins, MCM proteins.

    This family is also present in the archaebacteria in 1 to 4 copies. Methanocaldococcus jannaschii (Methanococcus jannaschii) has four members, MJ0363, MJ0961, MJ1489 and MJECL13.

    The "MCM motif" contains Walker-A and Walker-B type nucleotide binding motifs. The diagnostic sequence defining the MCMs is IDEFDKM. Only Mcm2 (aka Cdc19 or Nda1) has been subjected to mutational analysis in this region, and most mutations abolish its activity. The presence of a putative ATP-binding domain implies that these proteins may be involved in an ATP-consuming step in the initiation of DNA replication in eukaryotes.

    The MCM proteins bind together in a large complex. Within this complex, individual subunits associate with different affinities, and there is a tightly associated core of Mcm4 (Cdc21), Mcm6 (Mis5) and Mcm7. This core complex in human MCMs has been associated with helicase activity in vitro, leading to the suggestion that the MCM proteins are the eukaryotic replicative helicase.

    Schizosaccharomyces pombe (Fission yeast) MCMs, like those in metazoans, are found in the nucleus throughout the cell cycle. This is in contrast to Saccharomyces cerevisiae (Baker's yeast), in which MCM proteins move in and out of the nucleus during each cell cycle. The assembly of the MCM complex in S. pombe is required for MCM localisation, ensuring that only intact MCM complexes remain in the nucleus.

    Proteins where this domain is known:
    PF07_0023    PF13_0095    PF13_0291    PF14_0177    PFD0790c    PFE1345c    PFL0560c    PFL0580w   


    SM00355 - ZnF_C2H2 (Smart link)

    Interpro entry IPR015880 : Zinc finger, C2H2-like (Interpro link)

    Interpro description:

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents zinc finger domains resembling the C2H2-type.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF10_0058    PF10_0091    PF10_0313    PF11_0192    PF13_0278    PF14_0479    PF14_0559    PF14_0612    PF14_0643    PF14_0657    PF14_0707    PFC0690c    PFD0160w    PFD0375w    PFD0485w    PFI1215w    PFL0455c    PFL0465c    PFL1495w    PFL2075c   


    SM00356 - ZnF_C3H1 (Smart link)

    Interpro entry IPR000571 : Zinc finger, CCCH-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents C-x8-C-x5-C-x3-H (CCCH) type Zinc finger (Znf) domains. Proteins containing CCCH Znf domains include Znf proteins from eukaryotes involved in cell cycle or growth phase-related regulation, e.g. human TIS11B (butyrate response factor 1), a probable regulatory protein involved in regulating the response to growth factors, and the mouse TTP growth factor-inducible nuclear protein, which has the same function. The mouse TTP protein is induced by growth factors. Another protein containing this domain is the human splicing factor U2AF 35 kD subunit, which plays a critical role in both constitutive and enhancer-dependent splicing by mediating essential protein-protein interactions and protein-RNA interactions required for 3' splice site selection. It has been shown that different CCCH-type Znf proteins interact with the 3'-untranslated region of various mRNA. This type of Znf is very often present in two copies.
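
    The C-x8-C-x5-C-x3-H spacing quoted above can be read literally as a sequence pattern. The following is a minimal sketch of that reading only, not the SMART model for this entry: the function name find_ccch and the toy sequence are hypothetical, and a fixed-spacing pattern like this is much cruder than the profile-based models that domain databases use, so it will miss variant spacings.

```python
import re

# Literal reading of C-x8-C-x5-C-x3-H: "x" is any residue, spacing is fixed.
# The lookahead wrapper lets overlapping candidates be reported as well.
CCCH_PATTERN = re.compile(r"(?=(C.{8}C.{5}C.{3}H))")


def find_ccch(sequence):
    """Return (0-based start, matched substring) for each CCCH-spaced hit."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in CCCH_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, built so that it contains exactly one hit.
    toy = "MK" + "C" + "A" * 8 + "C" + "A" * 5 + "C" + "A" * 3 + "H" + "GG"
    for start, hit in find_ccch(toy):
        print(start, hit)
```

    Because the entry notes that this type of Znf is very often present in two copies, a scan of this kind would typically be checked for two hits per protein rather than one.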

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF10_0083    PF10_0186    PF11_0200    PF11_0357    PF13_0314    PF14_0236    PF14_0416    PF14_0610    PF14_0652    PFE1145w    PFE1245w    PFI0325c    PFI1335w    PFL0510c    PFL2310w   


    SM00359 - PUA (Smart link)

    Interpro entry IPR002478 : PUA (Interpro link)

    Interpro description:

    The PUA (PseudoUridine synthase and Archaeosine transglycosylase) domain was named after the proteins in which it was first found. PUA is a highly conserved RNA-binding motif found in a wide range of archaeal, bacterial and eukaryotic proteins, including enzymes that catalyse tRNA and rRNA post-transcriptional modifications, proteins involved in ribosome biogenesis and translation, as well as in enzymes involved in proline biosynthesis. The structures of several PUA-RNA complexes reveal a common RNA recognition surface, but also some versatility in the way in which the motif binds to RNA. PUA motifs are involved in dyskeratosis congenita and cancer, pointing to links between RNA metabolism and human diseases.

    Proteins where this domain is known:
    PF14_0174    PF14_0481    PF14_0635    PFE1470w   


    SM00360 - RRM (Smart link)

    Interpro entry IPR000504 : RNA recognition motif, RNP-1 (Interpro link)

    Interpro description:

    Many eukaryotic proteins containing one or more copies of a putative RNA-binding domain of about 90 amino acids are known to bind single-stranded RNAs. The largest group of single strand RNA-binding proteins is the eukaryotic RNA recognition motif (RRM) family that contains an eight amino acid RNP-1 consensus sequence. RRM proteins have a variety of RNA binding preferences and functions, and include heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing (SR, U2AF, Sxl), protein components of small nuclear ribonucleoproteins (U1 and U2 snRNPs), and proteins that regulate RNA stability and translation (PABP, La, Hu). The RRM in heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) appears to have two RRM-like domains with specialised features for protein recognition. The motif also appears in a few single stranded DNA binding proteins.

    The typical RRM consists of four anti-parallel beta-strands and two alpha-helices arranged in a beta-alpha-beta-beta-alpha-beta fold with side chains that stack with RNA bases. Specificity of RNA binding is determined by multiple contacts with surrounding amino acids. A third helix is present during RNA binding in some cases. The RRM is reviewed in a number of publications.

    Proteins where this domain is known:
    MAL13P1.120    MAL13P1.303    MAL13P1.338    MAL13P1.35    MAL7P1.126    MAL7P1.157a    MAL8P1.40    MAL8P1.83    PF07_0066    PF08_0086    PF10_0028    PF10_0047    PF10_0068    PF10_0194    PF10_0214    PF10_0217    PF10_0235    PF11_0083    PF11_0111    PF11_0205    PF11_0279    PF11_0320    PF11_0330    PF11_0347    PF11_0402    PF13_0122    PF13_0147    PF13_0165    PF13_0278    PF13_0315    PF13_0318    PF14_0028    PF14_0056    PF14_0057    PF14_0096    PF14_0194    PF14_0433    PF14_0513    PF14_0656    PFC0865w    PFD0700c    PFD0750w    PFD0775c    PFE0160c    PFE0750c    PFE0865c    PFE0885w    PFF0150c    PFF0300w    PFF0320c    PFF0505c    PFF0760w    PFF1425w    PFI0820c    PFI1025w    PFI1175c    PFI1435w    PFI1600w    PFI1695c    PFL0375w    PFL0830w    PFL1170w    PFL1200c    PFL1705w    PFL1745c    PFL2130w    PFL2310w   


    SM00361 - RRM_1 (Smart link)

    Interpro entry IPR003954 : RNA recognition, region 1 (Interpro link)

    Interpro description:
    Many eukaryotic proteins that are known or supposed to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the eukaryotic putative RNA-binding region RNP-1 signature, or RNA recognition motif (RRM). RRMs are found in a variety of RNA binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases. Two individual SMART models were built which identify subtypes of this domain, but there is no functional difference between the subtypes. This is one of the subtypes.

    Proteins where this domain is known:
    PF11_0200   


    SM00363 - S4 (Smart link)

    Interpro entry IPR002942 : RNA-binding S4 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The S4 domain is a small domain consisting of 60-65 amino acid residues that was detected in the bacterial ribosomal protein S4, eukaryotic ribosomal S9, two families of pseudouridine synthases, a novel family of predicted RNA methylases, a yeast protein containing a pseudouridine synthetase and a deaminase domain, bacterial tyrosyl-tRNA synthetases, and a number of uncharacterised, small proteins that may be involved in translation regulation. The S4 domain probably mediates binding to RNA.

    Proteins where this domain is known:
    PF11_0065    PF14_0584    PFB0890c    PFE1005w    PFL1380w   


    SM00365 - no description (Smart link)

    Proteins where this domain is known:
    MAL13P1.238    PF14_0496    PF14_0785    PFE0455w    PFF0595c    PFL1360c   


    SM00368 - LRR_RI (Smart link)

    Interpro entry IPR003590 : (Interpro link)

    Interpro description:

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    Reaction of amidotransferase domain:

      L-glutamine + H2O = L-glutamate + NH3 

    Reactions of FMN-binding domain:

      2-oxoglutarate + NH3 = 2-iminoglutarate + H2O
      2e + FMNox = FMNred
      2-iminoglutarate + FMNred = L-glutamate + FMNox

    The 3-D structure of ribonuclease inhibitor, a protein containing 15 LRRs, has been determined, revealing LRRs to be a new class of alpha/beta fold. LRRs form elongated non-globular structures and are often flanked by cysteine rich domains. This subtype is found in ribonuclease inhibitors.

    Proteins where this domain is known:
    PFL2380c   


    SM00369 - LRR_TYP (Smart link)

    Interpro entry IPR003591 : (Interpro link)

    Interpro description:

    Glutamate synthase (GltS) is a key enzyme in the early stages of the assimilation of ammonia in bacteria, yeasts, and plants. In bacteria, L-glutamate is involved in osmoregulation, is the precursor for other amino acids, and can be the precursor for haem biosynthesis. In plants, GltS is especially essential in the reassimilation of ammonia released by photorespiration. On the basis of the amino acid sequence and the nature of the electron donor, three different classes of GltS can be defined as follows: 1) ferredoxin-dependent GltS (Fd-GltS), 2) NADPH-dependent GltS (NADPH-GltS), and 3) NADH-dependent GltS (properties of the three classes have been reviewed extensively). The enzyme is a complex iron-sulphur flavoprotein catalysing the reductive transfer of the amido nitrogen from L-glutamine to 2-oxoglutarate to form two molecules of L-glutamate via intramolecular channelling of ammonia from the amidotransferase domain to the FMN-binding domain.

    Reaction of amidotransferase domain:

      L-glutamine + H2O = L-glutamate + NH3 

    Reactions of FMN-binding domain:

      2-oxoglutarate + NH3 = 2-iminoglutarate + H2O
      2e + FMNox = FMNred
      2-iminoglutarate + FMNred = L-glutamate + FMNox

    This entry represents the most populated subfamily of leucine-rich repeats.

    Proteins where this domain is known:
    PF14_0305   


    SM00382 - AAA (Smart link)

    Interpro entry IPR003593 : ATPase, AAA+ type, core (Interpro link)

    Interpro description:

    AAA ATPases (ATPases Associated with diverse cellular Activities) form a large protein family and play a number of roles in the cell including cell-cycle regulation, protein proteolysis and disaggregation, organelle biogenesis and intracellular transport. Some of them function as molecular chaperones, subunits of proteolytic complexes or independent proteases (FtsH, Lon). They also act as DNA helicases and transcription factors.

    AAA ATPases belong to the AAA+ superfamily of ring-shaped P-loop NTPases, which act via the energy-dependent unfolding of macromolecules. There are six major clades of AAA domains (proteasome subunits, metalloproteases, domains D1 and D2 of ATPases with two AAA domains, the MSP1/katanin/spastin group and BCS1 and its homologues), as well as a number of deeply branching minor clades.

    They assemble into oligomeric assemblies (often hexamers) that form a ring-shaped structure with a central pore. These proteins produce a molecular motor that couples ATP binding and hydrolysis to changes in conformational states that act upon a target substrate, either translocating or remodelling it.

    They are found in all living organisms and share a highly conserved AAA domain called the AAA module. This domain is responsible for ATP binding and hydrolysis. It contains 200-250 residues, including two classical motifs, Walker A (GX4GKT) and Walker B (HyDE).
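
    The Walker A consensus given above, GX4GKT, can be written as a simple pattern in which X4 stands for any four residues. The sketch below is an illustration under that literal reading and is not how the AAA module is detected in practice: the function name find_walker_a and the example sequences are hypothetical, and the allow_serine option is an added assumption reflecting the commonly cited GX4GK[ST] form of the motif, which the text above does not state.

```python
import re


def find_walker_a(sequence, allow_serine=False):
    """Return (0-based start, matched substring) for each Walker A-like hit.

    By default the motif is read exactly as GX4GKT. With allow_serine=True
    the final position also accepts S (an assumption, not from the entry).
    """
    last = "[ST]" if allow_serine else "T"
    # Lookahead so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=(G.{4}GK" + last + r"))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequences for illustration only.
    print(find_walker_a("MAGPPGTGKTLL"))
    print(find_walker_a("MAGPPGTGKSLL", allow_serine=True))
```

    A hit from such a pattern only flags a candidate nucleotide-binding loop; many unrelated P-loop NTPases carry the same motif, so it does not by itself identify an AAA domain.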

    More information about these proteins can be found at Protein of the Month: AAA ATPases. This entry represents the core domain of the AAA+ ATPases.

    Proteins where this domain is known:
    MAL13P1.344    MAL7P1.162    MAL7P1.209    MAL8P1.144    MAL8P1.92    PF07_0023    PF07_0047    PF08_0063    PF08_0078    PF08_0100    PF08_0117    PF10_0081    PF11_0071    PF11_0087    PF11_0175    PF11_0203    PF11_0225    PF11_0240    PF11_0314    PF11_0405    PF11_0466    PF13_0033    PF13_0063    PF13_0218    PF13_0271    PF13_0330    PF13_0350    PF14_0063    PF14_0126    PF14_0147    PF14_0177    PF14_0244    PF14_0326    PF14_0370    PF14_0455    PF14_0477    PF14_0548    PF14_0601    PF14_0616    PFA0545c    PFA0590w    PFB0840w    PFB0895c    PFC0140c    PFC0260w    PFC0875w    PFD0385c    PFD0665c    PFD1060w    PFE1150w    PFE1345c    PFF0155w    PFF0940c    PFI0355c    PFL0150w    PFL0495c    PFL1410c    PFL1725w    PFL1925w    PFL2005w    PFL2345c   


    SM00384 - AT_hook (Smart link)

    Interpro entry IPR017956 : AT hook, DNA-binding, conserved site (Interpro link)

    Interpro description:

    AT hooks are DNA-binding motifs with a preference for A/T rich regions. These motifs are found in a variety of proteins, including the high mobility group (HMG) proteins, in DNA-binding proteins from plants and in hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex.

    High mobility group (HMG) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG-I and HMG-Y (HMGA) are proteins of about 100 amino acid residues which are produced by the alternative splicing of a single gene. HMG-I/Y proteins bind preferentially to the minor groove of AT-rich regions in double-stranded DNA in a non-sequence specific manner. It is suggested that these proteins could function in nucleosome phasing and in the 3' end processing of mRNA transcripts. They are also involved in the transcription regulation of genes containing, or in close proximity to, AT-rich regions.

    Proteins where this domain is known:
    PF10_0075    PF10_0079   


    SM00385 - CYCLIN (Smart link)

    Interpro entry IPR006670 : (Interpro link)

    Interpro description:

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins, G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.

    This domain is also found in transcription factor IIB (TFIIB) and retinoblastoma.

    Proteins where this domain is known:
    MAL8P1.152    PF13_0022    PF14_0469    PF14_0605    PFF0270c   


    SM00386 - HAT (Smart link)

    Interpro entry IPR003107 : RNA-processing protein, HAT helix (Interpro link)

    Interpro description:

    The HAT (Half A TPR) repeat has a repetitive pattern characterised by three aromatic residues with a conserved spacing. They are structurally and sequentially similar to TPRs (tetratricopeptide repeats), though they lack the highly conserved alanine and glycine residues found in TPRs. The number of HAT repeats found in different proteins varies between 9 and 12. HAT-repeat-containing proteins appear to be components of macromolecular complexes that are required for RNA processing. The repeats may be involved in protein-protein interactions. The HAT motif has striking structural similarities to HEAT repeats, being of a similar length and consisting of two short helices connected by a loop domain, as in HEAT repeats.

    Proteins where this domain is known:
    PF11_0108    PF14_0042    PFD0180c    PFE1320w    PFL1735c   


    SM00387 - HATPase_c (Smart link)

    Interpro entry IPR003594 : ATP-binding region, ATPase-like (Interpro link)

    Interpro description:

    This domain is found in several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF07_0029    PF11_0188    PF14_0316    PF14_0417    PF14_0649    PFL1070c    PFL1915w   


    SM00397 - t_SNARE (Smart link)

    Interpro entry IPR000727 : (Interpro link)

    Interpro description:

    The process of vesicular fusion with target membranes depends on a set of SNAREs (SNAP-Receptors), which are associated with the fusing membranes. Target SNAREs (t-SNAREs) are localised on the target membrane and belong to two different families, the syntaxin-like family and the SNAP-25 like family. One member of each family, together with a v-SNARE localised on the vesicular membrane, are required for fusion.

    The Syntaxins are type-I transmembrane proteins that contain several regions with coiled-coil propensity in their cytosolic part, the SNARE motif. SNAP-25 is a protein consisting of two coiled-coil regions, which is associated with the membrane by lipid anchors. SNARE motifs assemble into parallel four helix bundles stabilised by the burial of these hydrophobic helix faces in the bundle core. Monomeric SNARE motifs are disordered so this assembly reaction is accompanied by a dramatic increase in alpha-helical secondary structure. The parallel arrangement of SNARE motifs within complexes bring the transmembrane anchors, and the two membranes, into close proximity. Recently, it was shown that the two coiled-coil regions of SNAP-25 and one of the coiled-coil regions of the syntaxins are related. This domain is found in both Syntaxin and SNAP-25 families as well as in other proteins.

    Proteins where this domain is known:
    MAL13P1.113    MAL13P1.169    PFB0480w    PFL0505c    PFL2070w   


    SM00398 - HMG (Smart link)

    Interpro entry IPR000910 : High mobility group, HMG1/HMG2 (Interpro link)

    Interpro description:

    High mobility group (HMG or HMGB) proteins are a family of relatively low molecular weight non-histone components in chromatin. HMG1 (also called HMG-T in fish) and HMG2 are two highly related proteins that bind single-stranded DNA preferentially and unwind double-stranded DNA. Although they have no sequence specificity, they have a high affinity for bent or distorted DNA, and bend linear DNA. HMG1 and HMG2 contain two DNA-binding HMG-box domains (A and B) that show structural and functional differences, and have a long acidic C-terminal domain rich in aspartic and glutamic acid residues. The acidic tail modulates the affinity of the tandem HMG boxes in HMG1 and 2 for a variety of DNA targets. HMG1 and 2 appear to play important architectural roles in the assembly of nucleoprotein complexes in a variety of biological processes, for example V(D)J recombination, the initiation of transcription, and DNA repair.

    The profile in this entry describing the HMG-domains is much more general than the signature. In addition to the HMG1 and HMG2 proteins, HMG-domains occur in single or multiple copies in the following protein classes; the SOX family of transcription factors; SRY sex determining region Y protein and related proteins; LEF1 lymphoid enhancer binding factor 1; SSRP recombination signal recognition protein; MTF1 mitochondrial transcription factor 1; UBF1/2 nucleolar transcription factors; Abf2 yeast ARS-binding factor; and Saccharomyces cerevisiae transcription factors Ixr1, Rox1, Nhp6a, Nhp6b and Spp41.

    Proteins where this domain is known:
    MAL13P1.290    MAL8P1.72    PFL0145c    PFL0290w   


    SM00414 - H2A (Smart link)

    Interpro entry IPR002119 : Histone H2A (Interpro link)

    Interpro description:
    Histone H2A is a small, highly conserved nuclear protein that, together with 2 molecules each of histones H2B, H3 and H4, forms the eukaryotic nucleosome core; the nucleosome octamer winds ~146 DNA base-pairs.

    Proteins where this domain is known:
    PFC0920w    PFF0860c   


    SM00417 - H4 (Smart link)

    Interpro entry IPR001951 : Histone H4 (Interpro link)

    Interpro description:
    Histone H4 is one of the four histones, along with H2A, H2B and H3, which form the eukaryotic nucleosome core. Along with H3, it plays a central role in nucleosome formation. The sequence of histone H4 has remained almost invariant in more than 2 billion years of evolution.

    Proteins where this domain is known:
    PF11_0061   


    SM00427 - H2B (Smart link)

    Interpro entry IPR000558 : Histone H2B (Interpro link)

    Interpro description:
    Histone H2B is one of the four histones, along with H2A, H3 and H4, which form the eukaryotic nucleosome core. It is a small, highly conserved nuclear protein; the nucleosome octamer winds ~146 DNA base-pairs.

    Proteins where this domain is known:
    PF07_0054    PF11_0062   


    SM00428 - H3 (Smart link)

    Interpro entry IPR000164 : Histone H3 (Interpro link)

    Interpro description:

    Histone H3 is one of the four histones, along with H2A, H2B and H4, which form the eukaryotic nucleosome octamer core; the nucleosome octamer winds ~146 DNA base-pairs. It is a highly conserved protein of 135 amino acid residues.

    Several proteins have been found to contain a C-terminal H3-like domain, including the mammalian centromeric protein CENP-A (which may act as a core histone necessary for the assembly of centromeres); yeast chromatin-associated protein CSE4; and Caenorhabditis elegans chromosome III proteins YL82_CAEEL and YMH3_CAEEL, whose function is unknown.

    Proteins where this domain is known:
    PF13_0185    PFF0510w    PFF0865w   


    SM00433 - TOP2c (Smart link)

    Interpro entry IPR001241 : DNA topoisomerase, type IIA, subunit B or N-terminal (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    This entry represents subunit B (gyrB and parE) of bacterial gyrase and topoisomerase IV, and the equivalent N-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has ATPase and subunit interaction capacity.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316    PFL1915w   


    SM00434 - TOP4c (Smart link)

    Interpro entry IPR002205 : DNA topoisomerase, type IIA, subunit A or C-terminal (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    This entry represents subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent C-terminal region in eukaryotic topoisomerase II composed of a single polypeptide. This subunit has DNA-binding capacity.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316    PFL1120c   


    SM00435 - TOPEUc (Smart link)

    Interpro entry IPR013499 : DNA topoisomerase I, C-terminal, eukaryotic-type (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry represents the C-terminal region of DNA topoisomerase I enzymes from eukaryotes (type IB enzymes). This region covers both the catalytic core and the DNA-binding domains.

    Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. The crystal structures of human topoisomerase I comprising the core and carboxyl-terminal domains in covalent and noncovalent complexes with 22-base pair DNA duplexes reveal an enzyme that "clamps" around essentially B-form DNA. The core domain and the first eight residues of the carboxyl-terminal domain of the enzyme, including the active-site nucleophile tyrosine-723, share significant structural similarity with the bacteriophage family of DNA integrases. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PFE0520c   


    SM00436 - TOP1Bc (Smart link)

    Interpro entry IPR003601 : DNA topoisomerase, type IA, domain 2 (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry describes domain 2 found in type IA topoisomerases, which may be an extension of the Toprim domain. The structures of bacterial topoisomerases I and III have been shown to consist of four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA. The N-terminal Toprim domain together with domain 3 forms the active site of the enzyme, while domains 2 and 4 form a single-strand DNA-binding groove. The Toprim domain forms a compact Rossmann fold that coordinates the Mg2+ ion.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF13_0251   


    SM00437 - TOP1Ac (Smart link)

    Interpro entry IPR003602 : DNA topoisomerase, type IA, DNA-binding (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry describes the DNA-binding domain (domain 3) found in type IA topoisomerases. The structures of bacterial topoisomerases I and III have been shown to consist of four domains that together form a toroidal structure with a central hole large enough to accommodate single- and double-stranded DNA. The N-terminal Toprim domain together with domain 3 (beta-barrel) forms the active site of the enzyme, while domains 2 and 4 (both winged-helix-like) form a single-strand DNA-binding groove. All topoisomerases cleave DNA by forming a transient phosphotyrosine bond; in type IA topoisomerases, the active site tyrosine is in domain 3.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF13_0251   


    SM00440 - ZnF_C2C2 (Smart link)

    Interpro entry IPR001222 : Zinc finger, TFIIS-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents a zinc finger motif found in transcription factor IIS (TFIIS). In eukaryotes, the initiation of transcription of protein-encoding genes by RNA polymerase II (Pol II) is modulated by general and specific transcription factors. The general transcription factors operate through common promoter elements (such as the TATA box). At least eight different proteins associate to form the general transcription factors: TFIIA, -IIB, -IID, -IIE, -IIF, -IIG, -IIH and -IIS. During mRNA elongation, Pol II can encounter DNA sequences that cause reverse movement of the enzyme. Such backtracking involves extrusion of the RNA 3'-end into the pore, and can lead to transcriptional arrest. Escape from arrest requires cleavage of the extruded RNA with the help of TFIIS, which induces mRNA cleavage by enhancing the intrinsic nuclease activity of Pol II, allowing transcription to continue past template-encoded pause sites. TFIIS extends from the polymerase surface via a pore to the internal active site. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    TFIIS is a protein of about 300 amino acids. It contains three regions: a variable N-terminal domain not required for TFIIS activity; a conserved central domain required for Pol II binding; and a conserved C-terminal C4-type zinc finger essential for RNA cleavage. The zinc finger folds in a conformation termed a zinc ribbon, characterised by a three-stranded antiparallel beta-sheet and two beta-hairpins. A backbone model for the Pol II-TFIIS complex was obtained from X-ray analysis. It shows that a beta-hairpin protrudes from the zinc finger and complements the Pol II active site.

    Some viral proteins also contain the TFIIS zinc ribbon C-terminal domain. The Vaccinia virus protein, unlike its eukaryotic homologue, is an integral RNA polymerase subunit rather than a readily separable transcription factor.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF07_0057    PFB0290c    PFD0360w   


    SM00441 - FF (Smart link)

    Interpro entry IPR002713 : (Interpro link)

    Interpro description:
    The FF domain may be involved in protein-protein interaction. It often occurs as multiple copies and often accompanies WW domains. PRP40 from yeast encodes a novel, essential splicing component that associates with the yeast U1 small nuclear ribonucleoprotein particle.

    Proteins where this domain is known:
    PF13_0091   


    SM00443 - G_patch (Smart link)

    Interpro entry IPR000467 : D111/G-patch (Interpro link)

    Interpro description:
    The D111/G-patch domain is a short conserved region of about 40 amino acids which occurs in a number of putative RNA-binding proteins, including tumor suppressor and DNA-damage-repair proteins, suggesting that this domain may have an RNA-binding function. This domain has seven highly conserved glycines.

    Proteins where this domain is known:
    PF14_0513    PFE1570c   


    SM00446 - LRRcap (Smart link)

    Interpro entry IPR003603 : (Interpro link)

    Interpro description:

    This motif occurs C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. Examples from the metazoa are described as either "Acidic leucine-rich nuclear phosphoprotein 32 family member A" or have been characterised as U2A', the protein that interacts with U2B'', facilitating the interaction with U2 snRNA. U2A' is required for spliceosome assembly and the efficient addition of U2 snRNP onto the pre-mRNA. The crystal structure of the spliceosomal U2B''-U2A' protein complex bound to a fragment of U2 small nuclear RNA has been described.

    Proteins where this domain is known:
    PF10_0320   


    SM00449 - SPRY (Smart link)

    Interpro entry IPR018355 : (Interpro link)

    Interpro description:
    The SPRY domain is of unknown function. Distant homologues are domains in butyrophilin/marenostrin/pyrin. Ca2+-release from the sarcoplasmic or endoplasmic reticulum, the intracellular Ca2+ store, is mediated by the ryanodine receptor (RyR) and/or the inositol trisphosphate receptor (IP3R).

    Proteins where this domain is known:
    PF08_0021    PFE1085w   


    SM00450 - RHOD (Smart link)

    Interpro entry IPR001763 : (Interpro link)

    Interpro description:

    Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.

    Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.

    Proteins where this domain is known:
    PF13_0027    PFL0320w   


    SM00451 - ZnF_U1 (Smart link)

    Interpro entry IPR003604 : Zinc finger, U1-type (Interpro link)

    Interpro description:

    C2H2-type (classical) zinc fingers (Znf) were the first class to be characterised. They contain a short beta hairpin and an alpha helix (beta/beta/alpha structure), where a single zinc atom is held in place by Cys(2)His(2) (C2H2) residues in a tetrahedral array. C2H2 Znf's can be divided into three groups based on the number and pattern of fingers: triple-C2H2 (binds single ligand), multiple-adjacent-C2H2 (binds multiple ligands), and separated paired-C2H2. C2H2 Znf's are the most common DNA-binding motifs found in eukaryotic transcription factors, and have also been identified in prokaryotes. Transcription factors usually contain several Znf's (each with a conserved beta/beta/alpha structure) capable of making multiple contacts along the DNA, where the C2H2 Znf motifs recognise DNA sequences by binding to the major groove of DNA via a short alpha-helix in the Znf, the Znf spanning 3-4 bases of the DNA. C2H2 Znf's can also bind to RNA and protein targets.

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents U1-type zinc finger domains, a family of C2H2-type zinc fingers present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF08_0084    PF14_0612    PFF0970w    PFL2075c   


    SM00454 - SAM (Smart link)

    Interpro entry IPR001660 : (Interpro link)

    Interpro description:

    The sterile alpha motif (SAM) domain is a putative protein interaction module present in a wide variety of proteins involved in many biological processes. The SAM domain, which spans around 70 residues, is found in diverse eukaryotic organisms. SAM domains have been shown to homo- and hetero-oligomerise, forming multiple self-association architectures, and also to bind various non-SAM domain-containing proteins, although with low affinity. SAM domains also appear to possess the ability to bind RNA. Smaug, a protein that helps to establish a morphogen gradient in Drosophila embryos by repressing the translation of nanos (nos) mRNA, binds to the 3' untranslated region (UTR) of nos mRNA via two similar hairpin structures. The 3D crystal structure of the Smaug RNA-binding region shows a cluster of positively charged residues on the Smaug-SAM domain, which could be the RNA-binding surface. This electropositive potential is unique among all previously determined SAM-domain structures and is conserved among Smaug-SAM homologs. These results suggest that the SAM domain might have a primary role in RNA binding.

    Structural analyses show that the SAM domain is arranged in a small five-helix bundle with two large interfaces. In the case of the SAM domain of EphB2, each of these interfaces is able to form dimers. The presence of these two distinct inter-monomer binding surfaces suggests that SAM could form extended polymeric structures.

    Proteins where this domain is known:
    PF11_0079   


    SM00456 - WW (Smart link)

    Interpro entry IPR001202 : WW/Rsp5/WWP (Interpro link)

    Interpro description:

    Synonym(s): Rsp5 or WWP domain

    The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes.

    A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein and others.
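
    The two sequence features mentioned above can be written as regular expressions. The following is only a minimal sketch: the function names are hypothetical, the test sequences are synthetic, and reading "spaced 20-23 amino acids apart" as 20-23 intervening residues is an assumption. A match is a candidate only, not evidence of a folded WW domain or a functional ligand.

```python
import re

# Proline-rich ligand motif quoted in the description: [AP]-P-P-[AP]-Y
LIGAND_MOTIF = re.compile(r"[AP]PP[AP]Y")

# Two tryptophans separated by 20-23 residues.
# Assumption: "spaced 20-23 amino acids apart" means 20-23 intervening residues.
WW_SIGNATURE = re.compile(r"W.{20,23}W")


def find_ligand_motifs(seq):
    """Return (start, matched_text) for each non-overlapping ligand-motif hit."""
    return [(m.start(), m.group()) for m in LIGAND_MOTIF.finditer(seq)]


def has_ww_signature(seq):
    """True if the sequence contains two Trp residues with the assumed spacing."""
    return WW_SIGNATURE.search(seq) is not None
```

    Note that `finditer` reports non-overlapping hits only, so adjacent overlapping motifs would be under-counted by this sketch.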

    Proteins where this domain is known:
    MAL8P1.40    PF13_0091    PF13_0315    PF14_0096    PFL1745c   


    SM00471 - HDc (Smart link)

    Interpro entry IPR003607 : Metal-dependent phosphohydrolase, HD region (Interpro link)

    Interpro description:
    The HD domain is found in a superfamily of enzymes with a predicted or known phosphohydrolase activity. These enzymes appear to be involved in nucleic acid metabolism, signal transduction and possibly other functions in bacteria, archaea and eukaryotes. The fact that all the highly conserved residues in the HD superfamily are histidines or aspartates suggests that coordination of divalent cations is essential for the activity of these proteins. This domain is also found in eukaryotic 3',5'-cGMP phosphodiesterase (PDE), which is located in photoreceptor outer segments and is light-activated, playing a pivotal role in signal transduction. This profile/HMM does not detect HD homologues in bacterial glycine aminoacyl-tRNA synthetases (beta subunit).

    Proteins where this domain is known:
    MAL13P1.118    MAL13P1.119    PF14_0672    PFL0475w    PFL1890c   


    SM00474 - 35EXOc (Smart link)

    Interpro entry IPR002562 : 3'-5' exonuclease (Interpro link)

    Interpro description:

    This domain is responsible for the 3'-5' exonuclease proofreading activity of Escherichia coli DNA polymerase I (polI) and other enzymes; it catalyses the hydrolysis of unpaired or mismatched nucleotides. This domain consists of the amino-terminal half of the Klenow fragment in E. coli polI; it is also found in the Werner syndrome helicase (WRN), focus forming activity 1 protein (FFA-1) and ribonuclease D (RNase D).

    Proteins where this domain is known:
    MAL8P1.35    PF14_0112    PF14_0473    PFA0290w    PFB0215c   


    SM00475 - 53EXOc (Smart link)

    Interpro entry IPR002421 : 5'-3' exonuclease (Interpro link)

    Interpro description:

    The N-terminal and internal 5'3'-exonuclease domains are commonly found together, and are most often associated with 5' to 3' nuclease activities. The XPG protein signatures are never found outside the '53EXO' domains. The latter are found in more diverse proteins. The number of amino acids that separate the two 53EXO domains, and the presence of accompanying motifs allow the diagnosis of several protein families.

    In the eubacterial type A DNA-polymerases, the N-terminal and internal domains are separated by a few amino acids, usually four. The pattern DNA_POLYMERASE_A is always present towards the C-terminus. Several eukaryotic structure-dependent endonucleases and exonucleases have the 53EXO domains separated by 24 to 27 amino acids, and the XPG protein signatures are always present. In several proteins from Herpesviridae, the two 53EXO domains are separated by 50 to 120 amino acids. These proteins are implicated in the inhibition of the expression of the host genes. Eukaryotic DNA repair proteins with 600 to 700 amino acids between the 53EXO domains all carry the XPG protein signatures.
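
    The spacer ranges given above can be turned into a simple lookup. This is only an illustrative sketch: the function name and labels are hypothetical, and the cut-off of 10 residues for "a few amino acids, usually four" is an assumption. The accompanying signatures (DNA_POLYMERASE_A, XPG) that the description also relies on are not checked here.

```python
def classify_53exo_spacer(spacer_length):
    """Suggest a protein family from the number of residues between
    the N-terminal and internal 53EXO domains."""
    if spacer_length < 0:
        raise ValueError("spacer length cannot be negative")
    # "a few amino acids, usually four"; the upper bound of 10 is an assumption
    if spacer_length <= 10:
        return "eubacterial type A DNA polymerase"
    if 24 <= spacer_length <= 27:
        return "eukaryotic structure-dependent endo/exonuclease"
    if 50 <= spacer_length <= 120:
        return "Herpesviridae host shut-off protein"
    if 600 <= spacer_length <= 700:
        return "eukaryotic DNA repair protein (XPG-like)"
    # Spacer lengths outside the ranges quoted in the description
    return "unclassified"
```

    A spacer length alone is not diagnostic; in practice it would be combined with the presence or absence of the accompanying motifs.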

    Proteins where this domain is known:
    PFB0180w    PFD0420c   


    SM00478 - ENDO3c (Smart link)

    Interpro entry IPR003265 : HhH-GPD domain (Interpro link)

    Interpro description:

    Endonuclease III is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic/apyrimidinic sites via a beta-elimination mechanism. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair. The 3-D structures of Escherichia coli endonuclease III and catalytic domain of MutY have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL). Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif.

    The HhH-GPD domain gets its name from its hallmark helix-hairpin-helix and Gly/Pro-rich loop followed by a conserved aspartate. This domain is found in a diverse range of structurally related DNA repair proteins that include endonuclease III and DNA glycosylase MutY, an A/G-specific adenine glycosylase. Both of these enzymes have a C-terminal iron-sulphur cluster loop (FCL). The methyl-CpG-binding protein MBD4 also contains a related domain that is a thymine DNA glycosylase. The family also includes DNA-3-methyladenine glycosylase II, 8-oxoguanine DNA glycosylases and other members of the AlkA family.
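
    The cysteine ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys translates directly into a regular expression. The sketch below is illustrative only: the function name is hypothetical and the example sequence in the usage note is synthetic, not taken from endonuclease III or MutY.

```python
import re

# Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys, where Xaa is any residue
FCL_PATTERN = re.compile(r"C.{6}C.{2}C.{5}C")


def find_fcl_candidates(seq):
    """Return (start, matched_text) for each stretch with the FCL cysteine spacing."""
    return [(m.start(), m.group()) for m in FCL_PATTERN.finditer(seq)]
```

    For a synthetic sequence such as "MK" followed by "CAAAAAACAACAAAAAC", the function reports one candidate starting at index 2. A hit only shows the cysteine spacing; it does not show that a cluster is actually bound.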

    Proteins where this domain is known:
    PF11_0306    PFF0715c    PFI0835c   


    SM00479 - EXOIII (Smart link)

    Interpro entry IPR006055 : Exonuclease (Interpro link)

    Interpro description:
    This entry includes a variety of exonuclease proteins, such as ribonuclease T and the epsilon subunit of DNA polymerase III. Ribonuclease T is responsible for the end-turnover of tRNA, and removes the terminal AMP residue from uncharged tRNA. DNA polymerase III is a complex, multichain enzyme responsible for most of the replicative synthesis in bacteria, and also exhibits 3' to 5' exonuclease activity.

    Proteins where this domain is known:
    PF13_0208   


    SM00482 - POLAc (Smart link)

    Interpro entry IPR001098 : DNA-directed DNA polymerase, family A (Interpro link)

    Interpro description:
    Synonym(s): DNA nucleotidyltransferase (DNA-directed)

    DNA-directed DNA polymerases are the key enzymes catalysing the accurate replication of DNA. They require either a small RNA molecule or a protein as a primer for the de novo synthesis of a DNA chain. A number of polymerases belong to this family.

    Proteins where this domain is known:
    PF14_0112    PFF1225c   


    SM00484 - XPGI (Smart link)

    Interpro entry IPR006086 : XPG I (Interpro link)

    Interpro description:

    Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.
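
    The conserved pentapeptide E-A-[DE]-A-[QS] of the I-region can be searched for with a short regular expression. This is a minimal sketch with a hypothetical function name and synthetic test sequences; it looks only for the pentapeptide and does not check the surrounding 27-residue core or its position within the protein.

```python
import re

# Conserved I-region pentapeptide quoted in the description: E-A-[DE]-A-[QS]
XPG_PENTAPEPTIDE = re.compile(r"EA[DE]A[QS]")


def find_xpg_pentapeptides(seq):
    """Return (start, matched_text) for each pentapeptide hit in the sequence."""
    return [(m.start(), m.group()) for m in XPG_PENTAPEPTIDE.finditer(seq)]
```

    A five-residue pattern is short enough to occur by chance, so hits are only meaningful alongside a domain-level match such as the XPGI or XPGN profiles.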

    Proteins where this domain is known:
    PF07_0105    PFB0265c    PFD0420c   


    SM00485 - XPGN (Smart link)

    Interpro entry IPR006085 : XPG N-terminal (Interpro link)

    Interpro description:

    Xeroderma pigmentosum (XP) is a human autosomal recessive disease, characterised by a high incidence of sunlight-induced skin cancer. The skin cells of people with this condition are hypersensitive to ultraviolet light, due to defects in the incision step of DNA excision repair. There are a minimum of seven genetic complementation groups involved in this pathway: XP-A to XP-G. XP-G is one of the rarest and most phenotypically heterogeneous forms of XP, showing anything from slight to extreme dysfunction in DNA excision repair. XP-G can be corrected by a 133 kDa nuclear protein, XPGC. XPGC is an acidic protein that confers normal UV resistance in expressing cells. It is a magnesium-dependent, single-strand DNA endonuclease that makes structure-specific endonucleolytic incisions in a DNA substrate containing a duplex region and single-stranded arms. XPGC cleaves one strand of the duplex at the border with the single-stranded region.

    XPG belongs to a family of proteins that includes RAD2 from Saccharomyces cerevisiae (Baker's yeast) and rad13 from Schizosaccharomyces pombe (Fission yeast), which are single-stranded DNA endonucleases; mouse and human FEN-1, a structure-specific endonuclease; RAD2 from fission yeast and RAD27 from budding yeast; fission yeast exo1, a 5'-3' double-stranded DNA exonuclease that may act in a pathway that corrects mismatched base pairs; yeast DHS1, and yeast DIN7. Sequence alignment of this family of proteins reveals that similarities are largely confined to two regions. The first is located at the N-terminal extremity (N-region) and corresponds to the first 95 to 105 amino acids. The second region is internal (I-region) and found towards the C-terminus; it spans about 140 residues and contains a highly conserved core of 27 amino acids that includes a conserved pentapeptide (E-A-[DE]-A-[QS]). It is possible that the conserved acidic residues are involved in the catalytic mechanism of DNA excision repair in XPG. The amino acids linking the N- and I-regions are not conserved.

    This entry represents the N-terminal region of XPG.

    Proteins where this domain is known:
    PF07_0105    PFB0265c    PFD0420c   


    SM00486 - POLBc (Smart link)

    Interpro entry IPR006172 : DNA-directed DNA polymerase, family B (Interpro link)

    Interpro description:

    DNA is the biological information that instructs cells how to exist in an ordered fashion: accurate replication is thus one of the most important events in the life cycle of a cell. This function is performed by DNA-directed DNA polymerases, which add nucleotide residues from deoxynucleoside triphosphates (dNTPs) to the 3'-end of the growing chain of DNA, using a complementary DNA chain as a template. Small RNA molecules are generally used as primers for chain elongation, although terminal proteins may also be used for the de novo synthesis of a DNA chain. Even though there are two different methods of priming, these are mediated by two very similar polymerase classes, A and B, with similar methods of chain elongation.

    A number of DNA polymerases have been grouped under the designation of DNA polymerase family B. Six regions of similarity (numbered from I to VI) are found in all or a subset of the B family polymerases. The most conserved region (I) includes a conserved tetrapeptide with two aspartate residues. Its function is not yet known. However, it has been suggested that it may be involved in binding a magnesium ion. All sequences in the B family contain a characteristic DTDS motif, and possess many functional domains, including a 5'-3' elongation domain, a 3'-5' exonuclease domain, a DNA-binding domain, and binding domains for both dNTPs and pyrophosphate.

    Proteins where this domain is known:
    PF10_0165    PF10_0362    PFD0590c    PFF1470c   


    SM00487 - DEXDc (Smart link)

    Interpro entry IPR014001 : (Interpro link)

    Interpro description:

    This entry is found in DEAD and DEAH box helicases. Helicases are involved in unwinding nucleic acids. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.

    Proteins where this domain is known:
    MAL13P1.14    MAL13P1.166    MAL13P1.216    MAL13P1.322    MAL7P1.113    MAL7P1.12    MAL8P1.19    MAL8P1.65    PF08_0042    PF08_0048    PF08_0096    PF08_0111    PF08_0126    PF10_0209    PF10_0232    PF10_0294    PF10_0309    PF10_0369    PF11_0053    PF11_0077    PF13_0037    PF13_0077    PF13_0177    PF13_0308    PF14_0081    PF14_0183    PF14_0185    PF14_0234    PF14_0278    PF14_0370    PF14_0429    PF14_0436    PF14_0563    PF14_0655    PFA0180w    PFB0445c    PFB0730w    PFB0860c    PFC0440c    PFC0915w    PFC0955w    PFD0245c    PFD0565c    PFD1060w    PFD1070w    PFE0205w    PFE0215w    PFE0430w    PFE0925c    PFE1085w    PFE1390w    PFF0100w    PFF0225w    PFF1185w    PFF1500c    PFI0165c    PFI0480w    PFI0860c    PFI0910w    PFL0100c    PFL1310c    PFL1525c    PFL2010c    PFL2440w    PFL2475w   


    SM00488 - DEXDc2 (Smart link)

    Interpro entry IPR006554 : Helicase-like, DEXD box c2 type (Interpro link)

    Interpro description:

    This domain of unknown function is found in the Xeroderma pigmentosum group D (XPD) proteins which belong to a family of ATP-dependent helicases characterised by a 'D-E-A-H' motif. This resembles the 'D-E-A-D-box' of other known helicases, which represents a special version of the B motif of ATP-binding proteins. In XPD, His replaces the second Asp. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.
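
    The distinction drawn above between the 'D-E-A-D' and 'D-E-A-H' motifs comes down to the fourth residue of the tetrapeptide. The following is a minimal sketch of that check in Python; the function name and the toy sequence are made up for illustration. A bare tetrapeptide match is not evidence of a helicase, since real domain assignment relies on profile models such as those behind the SMART entries listed here.

```python
import re

def classify_dexd_motifs(sequence):
    """Return (position, motif, label) for every DEAD or DEAH tetrapeptide.

    Positions are 0-based. A lookahead is used so that overlapping hits
    (for example DEAD followed directly by EAH) are both reported.
    """
    labels = {"DEAD": "DEAD-box", "DEAH": "DEAH-box"}
    hits = []
    for match in re.finditer(r"(?=(DEA[DH]))", sequence.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, labels[motif]))
    return hits

# Toy sequence invented for illustration; not a real protein.
toy_sequence = "MKDEADEAHLL"
print(classify_dexd_motifs(toy_sequence))
```

    In the toy sequence the final Asp of DEAD is also the first residue of DEAH, which is why the scan uses a lookahead rather than a plain non-overlapping search.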

    Proteins where this domain is known:
    MAL13P1.134    PFI1650w   


    SM00490 - HELICc (Smart link)

    Interpro entry IPR001650 : DNA/RNA helicase, C-terminal (Interpro link)

    Interpro description:

    The domain that defines this group of proteins is found in a wide variety of helicases and helicase-related proteins. It may be that this is not an autonomously folding unit, but an integral part of the helicase.

    The eukaryotic translation initiation factor 4A (eIF4A) is a member of the DEA(D/H)-box RNA helicase family. This is a diverse group of proteins that couples an ATPase activity to RNA binding and unwinding. The structure of the carboxyl-terminal domain of eIF4A has been determined to 1.75 Å resolution; it has a parallel alpha-beta topology that superimposes, with minor variations, on the structures and conserved motifs of the equivalent domain in other, distantly related helicases.

    Proteins where this domain is known:
    MAL13P1.14    MAL13P1.166    MAL13P1.216    MAL13P1.322    MAL7P1.113    MAL7P1.201    MAL8P1.19    MAL8P1.65    PF08_0042    PF08_0048    PF08_0096    PF08_0111    PF08_0126    PF10_0209    PF10_0232    PF10_0294    PF10_0309    PF10_0369    PF11_0053    PF11_0077    PF13_0037    PF13_0177    PF13_0308    PF14_0183    PF14_0185    PF14_0234    PF14_0278    PF14_0370    PF14_0429    PF14_0437    PF14_0563    PF14_0655    PFA0180w    PFB0445c    PFB0730w    PFB0860c    PFC0440c    PFC0915w    PFC0955w    PFD0245c    PFD1060w    PFD1070w    PFE0205w    PFE0215w    PFE0430w    PFE0925c    PFE1085w    PFE1390w    PFF0100w    PFF1140c    PFF1185w    PFF1500c    PFI0480w    PFI0860c    PFI0910w    PFL0100c    PFL1310c    PFL1525c    PFL2010c    PFL2440w    PFL2475w   


    SM00491 - HELICc2 (Smart link)

    Interpro entry IPR006555 : Helicase, ATP-dependent, c2 type (Interpro link)

    Interpro description:

    This domain of unknown function is found at the C-terminus of some ATP-dependent helicases characterised by a 'D-E-A-H' motif. This resembles the 'D-E-A-D-box' of other known helicases, which represents a special version of the B motif of ATP-binding proteins; in the 'D-E-A-H' motif, however, His replaces the second Asp. The DEAD box helicases are involved in various aspects of RNA metabolism, including nuclear transcription, pre-mRNA splicing, ribosome biogenesis, nucleocytoplasmic transport, translation, RNA decay and organellar gene expression.

    Proteins where this domain is known:
    MAL13P1.134    PF14_0081    PFI1650w   
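
    Because each "Proteins where this domain is known" line is a whitespace-separated list of gene identifiers, two entries can be compared with ordinary set operations. The sketch below uses the DEXDc2 (SM00488) and HELICc2 (SM00491) lists exactly as they appear in this file; the helper name parse_ids is an assumption introduced for illustration.

```python
def parse_ids(line):
    """Split a whitespace-separated protein list into a set of identifiers."""
    return set(line.split())

# Lists copied from the SM00488 (DEXDc2) and SM00491 (HELICc2) entries.
dexdc2 = parse_ids("MAL13P1.134    PFI1650w   ")
helicc2 = parse_ids("MAL13P1.134    PF14_0081    PFI1650w   ")

shared = sorted(dexdc2 & helicc2)        # proteins annotated with both domains
only_helicc2 = sorted(helicc2 - dexdc2)  # HELICc2 without DEXDc2

print("both:", shared)
print("HELICc2 only:", only_helicc2)
```

    The same approach scales to the longer DEXDc and HELICc lists, where the two sets overlap heavily but are not identical.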


    SM00493 - TOPRIM (Smart link)

    Interpro entry IPR006154 : Toprim subdomain (Interpro link)

    Interpro description:

    The toprim (topoisomerase-primase) domain is a conserved region from DnaG primases, topoisomerases, OLD family nucleases and RecR/M DNA repair proteins. The fold of the TOPRIM domain resembles a Rossmann-like nucleotide-binding fold, with a central beta-sheet formed by 4 parallel beta-strands flanked by 3 alpha-helices. Only 5 residues are conserved across all TOPRIM domains: 2 of these are glycines, which may play a structural role; the other 3 are acidic residues that are present in 2 conserved sequence motifs and may have a metal-binding function.

    The TOPRIM domain may form a shallow groove on these molecules and play a role in the binding of double-helical DNA/RNA hybrids.

    Proteins where this domain is known:
    PF13_0251    PF14_0112   


    SM00498 - FH2 (Smart link)

    Interpro entry IPR003104 : Actin-binding FH2 and DRF autoregulatory (Interpro link)

    Interpro description:

    Formin homology (FH) proteins play a crucial role in the reorganisation of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other, and may also act to inhibit actin polymerisation. The FH3 domain is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).

    This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains. At its C terminus is the DRF autoregulatory region.

    Proteins where this domain is known:
    PFE1545c    PFL0925w   


    SM00500 - SFM (Smart link)

    Interpro entry IPR003648 : Splicing factor motif (Interpro link)

    Interpro description:
    The splicing factor motif is present in splicing factors including Prp18 and Prp4. In yeast, Prp4 is a U4/U6 small nuclear ribonucleoprotein involved in RNA splicing. It is required for the association of U4/U6 snRNP with U5 snRNP in an early step of spliceosome assembly.

    Proteins where this domain is known:
    MAL13P1.385   


    SM00503 - SynN (Smart link)

    Interpro entry IPR006011 : Syntaxin, N-terminal (Interpro link)

    Interpro description:

    Syntaxins A and B are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane. Syntaxins are a family of receptors for intracellular transport vesicles. Each target membrane may be identified by a specific member of the syntaxin family. Members of the syntaxin family share several features: a size ranging from 30 kDa to 40 kDa; a highly hydrophobic C-terminal extremity that anchors the protein on the cytoplasmic surface of cellular membranes; and a central, well-conserved region that seems to adopt a coiled-coil conformation.

    Proteins where this domain is known:
    PFB0480w   


    SM00504 - Ubox (Smart link)

    Interpro entry IPR003613 : U box (Interpro link)

    Interpro description:

    Quality control of intracellular proteins is essential for cellular homeostasis. Molecular chaperones recognise and contribute to the refolding of misfolded or unfolded proteins, whereas the ubiquitin-proteasome system mediates the degradation of such abnormal proteins. Ubiquitin-protein ligases (E3s) determine the substrate specificity for ubiquitylation and have been classified into HECT and RING-finger families. More recently, however, U-box proteins, which contain a domain (the U box) of about 70 amino acids that is conserved from yeast to humans, have been identified as a new type of E3.

    Members of the U-box family of proteins constitute a class of ubiquitin-protein ligases (E3s) distinct from the HECT-type and RING finger-containing E3 families. Using yeast two-hybrid technology, all mammalian U-box proteins have been reported to interact with molecular chaperones or co-chaperones, including Hsp90, Hsp70, DnaJc7, EKN1, CRN, and VCP. This suggests that the function of U box-type E3s is to mediate the degradation of unfolded or misfolded proteins in conjunction with molecular chaperones as receptors that recognise such abnormal proteins.

    Unlike the RING finger domain that is stabilised by Zn2+ ions coordinated by the cysteines and a histidine, the U-box scaffold is probably stabilised by a system of salt-bridges and hydrogen bonds. The charged and polar residues that participate in this network of bonds are more strongly conserved in the U-box proteins than in classic RING fingers, which supports their role in maintaining the stability of the U box. Thus, the U box appears to have evolved from a RING finger domain by appropriation of a new set of residues required to stabilise its structure, concomitant with the loss of the original, metal-chelating residues.

    Proteins where this domain is known:
    PF07_0026    PF08_0020    PFC0365w   


    SM00507 - HNHc (Smart link)

    Interpro entry IPR003615 : (Interpro link)

    Interpro description:
    This domain is found in the HNH family of nucleases, which includes yeast intron 1 protein, MutS, and bacterial colicins and pyocins. Members of the family are found in bacteria, viruses and eukaryotes.

    Proteins where this domain is known:
    PFF0225w   


    SM00508 - PostSET (Smart link)

    Interpro entry IPR003616 : (Interpro link)

    Interpro description:

    This region is found in a number of histone lysine methyltransferases (HMTase), C-terminal to the SET domain; it is generally described as the post-SET domain.

    Histone lysine methylation is part of the histone code that regulates chromatin function and epigenetic control of gene function. Histone lysine methyltransferases (HMTase) differ both in their substrate specificity for the various acceptor lysines and in their product specificity for the number of methyl groups (one, two, or three) they transfer. With just one exception, the HMTases belong to the SET family, which can be classified according to the sequences surrounding the SET domain. Structural studies on the human SET7/9, a mono-methylase, have revealed the molecular basis for the specificity of the enzyme for the histone target and the roles of the invariant residues in the SET domain in determining the methylation specificities.

    The pre-SET domain, as found in the SUV39 SET family, contains nine invariant cysteine residues that are grouped into two segments separated by a region of variable length. These 9 cysteines coordinate 3 zinc ions to form a triangular cluster, where each of the zinc ions is coordinated by 4 cysteines to give a tetrahedral configuration. The function of this domain is structural, holding together 2 long segments of random coils.
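
    The counts given for the pre-SET cluster imply a step that is not spelled out: 3 zinc ions with 4 cysteine ligands each need 12 zinc-sulphur contacts, but only 9 cysteines are available, so 3 cysteines must each bridge two zinc ions. The sketch below builds one arrangement consistent with those counts (one bridging cysteine per edge of the triangle plus two terminal cysteines per zinc) and checks the bookkeeping. The labels are hypothetical and do not correspond to residue numbers in any particular protein.

```python
from itertools import combinations

zincs = ["Zn1", "Zn2", "Zn3"]
ligands = {}  # cysteine label -> set of zinc ions it coordinates

# One bridging cysteine on each edge of the zinc triangle.
for index, pair in enumerate(combinations(zincs, 2), start=1):
    ligands[f"Cys_bridge_{index}"] = set(pair)

# Two terminal (non-bridging) cysteines on each zinc ion.
for zinc in zincs:
    for index in (1, 2):
        ligands[f"Cys_{zinc}_terminal_{index}"] = {zinc}

# Coordination number of each zinc = number of cysteines that bind it.
coordination = {
    zinc: sum(zinc in bound for bound in ligands.values()) for zinc in zincs
}
bridging = [name for name, bound in ligands.items() if len(bound) == 2]

print(len(ligands), "cysteines;", len(bridging), "bridging")
print(coordination)
```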

    The C-terminal region including the post-SET domain is disordered when not interacting with a histone tail and in the absence of zinc. The three conserved cysteines in the post-SET domain form a zinc-binding site when coupled to a fourth conserved cysteine in the knot-like structure close to the SET domain active site. The structured post-SET region brings in the C-terminal residues that participate in S-adenosylmethionine binding and histone tail interactions. The three conserved cysteine residues are essential for HMTase activity, as replacement with serine abolishes HMTase activity.

    Proteins where this domain is known:
    PFF1440w   


    SM00512 - Skp1 (Smart link)

    Interpro entry IPR001232 : SKP1 component (Interpro link)

    Interpro description:

    SKP1 (together with SKP2) was identified as an essential component of the cyclin A-CDK2 S phase kinase complex. It was found to bind several F-box containing proteins (e.g., Cdc4, Skp2, cyclin F) and to be involved in the ubiquitin protein degradation pathway. A yeast homologue of SKP1 (P52286) was identified in the centromere bound kinetochore complex and is also involved in the ubiquitin pathway. In Dictyostelium discoideum (Slime mold) FP21 was shown to be glycosylated in the cytosol and has homology to SKP1.

    Proteins where this domain is known:
    MAL13P1.337   


    SM00513 - SAP (Smart link)

    Interpro entry IPR003034 : DNA-binding SAP (Interpro link)

    Interpro description:

    The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA binding domain found in diverse nuclear proteins involved in chromosomal organization, including in apoptosis. In yeast, SAP is found in the most distal N-terminal region of E3 SUMO-protein ligase SIZ1, where it is involved in nuclear localization.

    Proteins where this domain is known:
    MAL13P1.302   


    SM00516 - SEC14 (Smart link)

    Interpro entry IPR001251 : (Interpro link)

    Interpro description:
    This entry defines the C-terminal region of various retinaldehyde/retinal-binding proteins that may be functional components of the visual cycle. Cellular retinaldehyde-binding protein (CRALBP) carries 11-cis-retinol or 11-cis-retinaldehyde as endogenous ligands and may function as a substrate carrier protein that modulates interaction of these retinoids with visual cycle enzymes. The multidomain protein Trio binds the LAR transmembrane tyrosine phosphatase, contains a protein kinase domain, and has separate rac-specific and rho-specific guanine nucleotide exchange factor domains. Trio is a multifunctional protein that integrates and amplifies signals involved in coordinating actin remodeling, which is necessary for cell migration and growth.

    Other members of the family are transfer proteins, including a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.

    Proteins where this domain is known:
    PF11_0287   


    SM00517 - PolyA (Smart link)

    Interpro entry IPR002004 : Polyadenylate-binding protein/Hyperplastic disc protein (Interpro link)

    Interpro description:

    The polyadenylate-binding protein (PABP) has a conserved C-terminal domain (PABC), which is also found in the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains. PABP recognises the 3' mRNA poly(A) tail and plays an essential role in eukaryotic translation initiation and mRNA stabilisation/degradation. PABC domains of PABP are peptide-binding domains that mediate PABP homo-oligomerisation and protein-protein interactions. In mammals, the PABC domain of PABP functions to recruit several different translation factors to the mRNA poly(A) tail.

    Proteins where this domain is known:
    PFL1170w   


    SM00518 - AP2Ec (Smart link)

    Interpro entry IPR001719 : AP endonuclease, family 2 (Interpro link)

    Interpro description:

    DNA damaging agents such as the anti-tumour drugs bleomycin and neocarzinostatin or those that generate oxygen radicals produce a variety of lesions in DNA. Amongst these is base-loss which forms apurinic/apyrimidinic (AP) sites or strand breaks with atypical 3' termini. DNA repair at the AP sites is initiated by specific endonuclease cleavage of the phosphodiester backbone. Such endonucleases are also generally capable of removing blocking groups from the 3' terminus of DNA strand breaks.

    AP endonucleases can be classified into two families based on sequence similarity. Family 2 groups the enzymes listed below.

    Escherichia coli endonuclease IV and its S. cerevisiae homologue Apn1 have been shown to be transition metalloproteins that require zinc and manganese for activity.

    Proteins where this domain is known:
    PF13_0176   


    SM00525 - FES (Smart link)

    Interpro entry IPR003651 : Endonuclease III-like, iron-sulphur cluster loop (Interpro link)

    Interpro description:

    Endonuclease III is a DNA repair enzyme which removes a number of damaged pyrimidines from DNA via its glycosylase activity and also cleaves the phosphodiester backbone at apurinic/apyrimidinic sites via a beta-elimination mechanism. The structurally related DNA glycosylase MutY recognises and excises the mutational intermediate 8-oxoguanine-adenine mispair. The 3-D structures of Escherichia coli endonuclease III and the catalytic domain of MutY have been determined. The structures contain two all-alpha domains: a sequence-continuous, six-helix domain (residues 22-132) and a Greek-key, four-helix domain formed by one N-terminal and three C-terminal helices (residues 1-21 and 133-211) together with the [Fe4S4] cluster. The cluster is bound entirely within the C-terminal loop by four cysteine residues with a ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys which is distinct from all other known Fe4S4 proteins. This structural motif is referred to as a [Fe4S4] cluster loop (FCL). Two DNA-binding motifs have been proposed, one at either end of the interdomain groove: the helix-hairpin-helix (HhH) and FCL motifs. The primary role of the iron-sulphur cluster appears to involve positioning conserved basic residues for interaction with the DNA phosphate backbone by forming the loop of the FCL motif.
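
    The ligation pattern Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys quoted above translates directly into a regular expression, with Xaa standing for any residue. The following is a minimal sketch of such a scan; the function name and the toy sequence are invented for illustration, and a pattern hit on its own does not show that a sequence binds an iron-sulphur cluster.

```python
import re

# Cys-(Xaa)6-Cys-(Xaa)2-Cys-(Xaa)5-Cys, where Xaa is any single residue.
FCL_PATTERN = re.compile(r"C.{6}C.{2}C.{5}C")

def find_fcl_candidates(sequence):
    """Return (start, matched_text) for each non-overlapping hit (0-based)."""
    return [
        (match.start(), match.group())
        for match in FCL_PATTERN.finditer(sequence.upper())
    ]

# Synthetic sequence built to contain the spacing once; not a real protein.
toy_sequence = "MKT" + "C" + "AAAAAA" + "C" + "GG" + "C" + "LLLLL" + "C" + "KE"
print(find_fcl_candidates(toy_sequence))
```

    The matched span is 17 residues long: four cysteines plus the 6, 2 and 5 intervening residues.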

    The iron-sulphur cluster loop (FCL) is also found in DNA-(apurinic or apyrimidinic site) lyase, a subfamily of endonuclease III. The enzyme has both apurinic and apyrimidinic endonuclease activity and a DNA N-glycosylase activity. It cuts damaged DNA at cytosines, thymines and guanines, and acts on the damaged strand 5' of the damaged site. The enzyme binds a 4Fe-4S cluster which is not important for the catalytic activity, but is probably involved in the alignment of the enzyme along the DNA strand.

    Proteins where this domain is known:
    PFF0715c   


    SM00530 - HTH_XRE (Smart link)

    Interpro entry IPR001387 : Helix-turn-helix type 3 (Interpro link)

    Interpro description:

    This is a large family of DNA-binding helix-turn-helix proteins that includes a bacterial plasmid copy control protein, bacterial methylases, various bacteriophage transcription control proteins and a vegetative-specific protein from Dictyostelium discoideum (Slime mould).

    Proteins where this domain is known:
    PF11_0293   


    SM00533 - MUTSd (Smart link)

    Interpro entry IPR007696 : DNA mismatch repair protein MutS, core (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex structure, and is composed of:

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA.

    Proteins where this domain is known:
    MAL7P1.206    PF14_0254    PFE0270c   


    SM00534 - MUTSac (Smart link)

    Interpro entry IPR000432 : DNA mismatch repair protein MutS, C-terminal (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex structure, and is composed of:

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the C-terminal region found in proteins in the MutS family of DNA mismatch repair proteins. The C-terminal region of MutS is comprised of the ATPase domain and the HTH (helix-turn-helix) domain, the latter being involved in dimer contacts. Yeast MSH3, bacterial proteins involved in DNA mismatch repair, and the predicted protein product of the Rep-3 gene of mouse share extensive sequence similarity. Human MSH has been implicated in hereditary non-polyposis colorectal carcinoma (HNPCC) and is a mismatch-binding protein.

    Proteins where this domain is known:
    MAL7P1.206    PF14_0051    PF14_0254    PFE0270c   


    SM00543 - MIF4G (Smart link)

    Interpro entry IPR003890 : MIF4G-like, type 3 (Interpro link)

    Interpro description:

    This entry represents an MIF4G-like domain. MIF4G domains share a common structure but can differ in sequence. This entry is designated "type 3", and is found in nuclear cap-binding proteins, eIF4G, and UPF2.

    The MIF4G domain is a structural motif with an ARM (Armadillo) repeat-type fold, consisting of a 2-layer alpha/alpha right-handed superhelix. Proteins usually contain two or more structurally similar MIF4G domains connected by unstructured linkers. MIF4G domains are found in several proteins involved in RNA metabolism, including eIF4G (eukaryotic initiation factor 4-gamma), eIF-2b (translation initiation factor), UPF2 (regulator of nonsense transcripts 2), and nuclear cap-binding proteins (CBP80, CBC1, NCBP1), although the sequence identity between them may be low.

    The nuclear cap-binding complex (CBC) is a heterodimer. Human CBC consists of a large CBP80 subunit and a small CBP20 subunit, the latter being critical for cap binding. CBP80 contains three MIF4G domains connected with long linkers, while CBP20 has an RNP (ribonucleoprotein)-type domain that associates with domains 2 and 3 of CBP80. The complex binds to 5'-cap of eukaryotic RNA polymerase II transcripts, such as mRNA and U snRNA. The binding is important for several mRNA nuclear maturation steps and for nonsense-mediated decay. It is also essential for nuclear export of U snRNAs in metazoans.

    Eukaryotic translation initiation factor 4 gamma (eIF4G) plays a critical role in protein expression, and is at the centre of a complex regulatory network. Together with the cap-binding protein eIF4E, it recruits the small ribosomal subunit to the 5'-end of mRNA and promotes the assembly of a functional translation initiation complex, which scans along the mRNA to the translation start codon. The activity of eIF4G in translation initiation could be regulated through intra- and inter-protein interactions involving the ARM repeats. In eIF4G, the MIF4G domain binds eIF4A, eIF3, RNA and DNA.

    Nonsense-mediated mRNA decay (NMD) in eukaryotes involves UPF1, UPF2 and UPF3 to accelerate the decay rate of two unique classes of transcripts: (1) nonsense mRNAs that arise through errors in gene expression, and (2) naturally occurring transcripts that lack coding errors but have built-in features that target them for accelerated decay (error-free mRNAs). NMD can trigger decay during any round of translation and can target CBC-bound or eIF-4E-bound transcripts. UPF2 contains MIF4G domains, while UPF3 contains an RNP domain.

    Proteins where this domain is known:
    MAL13P1.63    PF11_0086    PFL1855w   


    SM00544 - MA3 (Smart link)

    Interpro entry IPR003891 : (Interpro link)

    Interpro description:

    This entry represents the MI domain (named after MA-3 and eIF4G), a protein-protein interaction module of ~130 amino acids. It appears in several translation factors and is found in:

    The MI domain consists of seven alpha-helices, which pack into a globular form. The packing arrangement consists of repeating pairs of antiparallel helices packed one upon the other such that a superhelical axis is generated perpendicular to the alpha-helical axes.

    The MI domain has also been named MA3 domain.

    Proteins where this domain is known:
    PF14_0546    PFL1855w   


    SM00547 - ZnF_RBZ (Smart link)

    Interpro entry IPR001876 : Zinc finger, RanBP2-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents the zinc finger domain found in RanBP2 proteins. Ran is an evolutionarily conserved member of the Ras superfamily that regulates all receptor-mediated transport between the nucleus and the cytoplasm. Ran binding protein 2 (RanBP2) is a 358-kDa nucleoporin located on the cytoplasmic side of the nuclear pore complex which plays a role in nuclear protein import. RanBP2 contains multiple zinc fingers which mediate binding to RanGDP.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PF13_0099    PF13_0278    PFD0405c   


    SM00553 - SEP (Smart link)

    Interpro entry IPR012989 : (Interpro link)

    Interpro description:

    The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain.

    This domain has a 2-layer beta(3)-alpha(2)-beta fold, and is present in a number of other proteins as well, including FAF1 (Fas-associated factor 1) and undulin 2. Many of these proteins also contain the UBX domain C-terminal to the FAF domain. This domain is found in many eukaryotic proteins.

    Proteins where this domain is known:
    MAL8P1.122   


    SM00554 - FAS1 (Smart link)

    Interpro entry IPR000782 : (Interpro link)

    Interpro description:

    The FAS1 (fasciclin-like) domain is an extracellular module of about 140 amino acid residues. It has been suggested that the FAS1 domain represents an ancient cell adhesion domain common to plants and animals; related FAS1 domains are also found in bacteria.

    The crystal structure of FAS1 domains 3 and 4 of fasciclin I from Drosophila melanogaster (Fruit fly) has been determined, revealing a novel domain fold consisting of a seven-stranded beta wedge and at least five alpha helices; two well-ordered N-acetylglucosamine groups attached to a conserved asparagine are located in the interface region between the two FAS1 domains. Fasciclin I is an insect neural cell adhesion molecule involved in axonal guidance that is attached to the membrane by a GPI anchor.

    FAS1 domains are present in many secreted and membrane-anchored proteins. These proteins are usually GPI anchored and consist of: (i) a single FAS1 domain, (ii) a tandem array of FAS1 domains, or (iii) FAS1 domain(s) interspersed with other domains.

    Proteins known to contain a FAS1 domain include:

    The FAS1 domains of both human periostin and BIgH3 proteins were found to contain vitamin K-dependent gamma-carboxyglutamate residues. Gamma-carboxyglutamate residues are more commonly associated with GLA domains, where they occur through post-translational modification catalysed by the vitamin K-dependent enzyme gamma-glutamylcarboxylase.

    Proteins where this domain is known:
    PF14_0446   


    SM00558 - JmjC (Smart link)

    Interpro entry IPR003347 : (Interpro link)

    Interpro description:

    This entry contains:

    Proteins where this domain is known:
    MAL8P1.111   


    SM00562 - NDK (Smart link)

    Interpro entry IPR001564 : Nucleoside diphosphate kinase, core (Interpro link)

    Interpro description:

    Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.

    In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
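    The random association of A and B chains described above can be illustrated with a short sketch. It lists the seven possible hexamer compositions and, purely as an assumption, weights them with a binomial model in which each of the six positions is filled independently by an A chain with probability p_a. The function names and the p_a parameter are hypothetical and are not part of the Interpro entry.

```python
from math import comb


def composition_label(n_a, n_b):
    """Return a label such as 'A5B' or 'A3B3' for a hexamer with n_a A chains and n_b B chains."""
    def part(symbol, count):
        if count == 0:
            return ""
        return symbol if count == 1 else f"{symbol}{count}"
    return part("A", n_a) + part("B", n_b)


def hexamer_distribution(p_a=0.5, n_subunits=6):
    """Map each composition (A6 ... B6) to its expected fraction.

    Assumption: chains are incorporated independently, so the number of
    B chains per hexamer follows a binomial distribution.
    """
    p_b = 1.0 - p_a
    return {
        composition_label(n_subunits - k, k): comb(n_subunits, k) * p_a ** (n_subunits - k) * p_b ** k
        for k in range(n_subunits + 1)
    }


if __name__ == "__main__":
    for name, fraction in hexamer_distribution().items():
        print(f"{name}: {fraction:.4f}")
```

    Under this simplified model, equal amounts of A and B chains would favour the mixed A3B3 composition, while pure A6 and B6 hexamers would be the rarest. Real isoenzyme ratios depend on expression levels and assembly preferences that the model ignores.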

    NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.

    NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.

    Proteins where this domain is known:
    PF13_0349   


    SM00563 - PlsC (Smart link)

    Interpro entry IPR002123 : Phospholipid/glycerol acyltransferase (Interpro link)

    Interpro description:

    This family contains acyltransferases involved in phospholipid biosynthesis and other proteins of unknown function. This domain is found in tafazzins, defects in which are the cause of Barth syndrome, a severe inherited disorder which is often fatal in childhood and is characterised by cardiac and skeletal abnormalities. Phospholipid/glycerol acyltransferase is not found in the viruses or the archaea and is under-represented in the bacteria. Bacterial glycerol-phosphate acyltransferases are involved in membrane biogenesis since they use fatty acid chains to form the first membrane phospholipids.

    Proteins where this domain is known:
    PF14_0421    PFI0695c    PFL0620c   


    SM00567 - EZ_HEAT (Smart link)

    Interpro entry IPR004155 : (Interpro link)

    Interpro description:

    These proteins contain a short bi-helical repeat that is related to HEAT. Cyanobacteria and red algae harvest light energy using macromolecular complexes known as phycobilisomes (PBS), peripherally attached to the photosynthetic membrane. The major components of PBS are the phycobiliproteins. These heterodimeric proteins are covalently attached to phycobilins: open-chain tetrapyrrole chromophores, which function as the photosynthetic light-harvesting pigments. Phycobiliproteins differ in sequence and in the nature and number of attached phycobilins to each of their subunits. These proteins include the lyase enzymes that specifically attach particular phycobilins to apophycobiliprotein subunits. The most comprehensively studied of these is the CpcE/F lyase, which attaches phycocyanobilin (PCB) to the alpha subunit of apophycocyanin. Similarly, MpeU/V attaches phycoerythrobilin to phycoerythrin II, while CpeY/Z is thought to be involved in phycoerythrobilin (PEB) attachment to phycoerythrin (PE) I (PEs I and II differ in sequence and in the number of attached molecules of PEB: PE I has five, PE II has six).

    All the reactions of the above lyases involve an apoprotein cysteine SH addition to a terminal delta 3,3'-double bond. Such a reaction is not possible in the case of phycoviolobilin (PVB), the phycobilin of alpha-phycoerythrocyanin (alpha-PEC). It is thought that in this case, PCB, not PVB, is first added to apo-alpha-PEC, and is then isomerized to PVB. The addition reaction has been shown to occur in the presence of either of the components of alpha-PEC-PVB lyase PecE or PecF (or both). The isomerisation reaction occurs only when both PecE and PecF components are present, i.e. the PecE/F phycobiliprotein lyase is also a phycobilin isomerase. Another member of this family is the NblB protein, whose similarity to the phycobiliprotein lyases was previously noted. This constitutively expressed protein is not known to have any lyase activity. It is thought to be involved in the coordination of PBS degradation with environmental nutrient limitation. It has been suggested that the similarity of NblB to the phycobiliprotein lyases is due to the ability to bind tetrapyrrole phycobilins via the common repeated motif.

    Proteins where this domain is known:
    PF13_0013    PF14_0632   


    SM00568 - GRAM (Smart link)

    Interpro entry IPR004182 : (Interpro link)

    Interpro description:

    The GRAM domain is found in glucosyltransferases, myotubularins and other putative membrane-associated proteins. It is normally about 70 amino acids in length. It is thought to be an intracellular protein-binding or lipid-binding signalling domain, which has an important function in membrane-associated processes. Mutations in the GRAM domain of myotubularins cause a muscle disease, which suggests that the domain is essential for the full function of the enzyme. Myotubularin-related proteins are a large subfamily of protein tyrosine phosphatases (PTPs) that dephosphorylate D3-phosphorylated inositol lipids.

    Proteins where this domain is known:
    MAL8P1.143   


    SM00577 - CPDc (Smart link)

    Interpro entry IPR004274 : (Interpro link)

    Interpro description:

    The function of this domain is unclear. It is found in proteins of diverse function, including phosphatases, some of which may be active in ternary elongation complexes, and a number of NLI interacting factors. In the phosphatases this domain is often present N-terminal to the BRCT domain.

    Proteins where this domain is known:
    MAL13P1.275    PF07_0110    PF10_0124    PFE0795c   


    SM00579 - FBD (Smart link)

    Interpro entry IPR006566 : (Interpro link)

    Interpro description:

    This domain of unknown function is found in FBox and BRCT domain containing plant proteins.

    Proteins where this domain is known:
    PF14_0257   


    SM00580 - PUG (Smart link)

    Interpro entry IPR006567 : (Interpro link)

    Interpro description:

    PUG is a domain in protein kinases, N-glycanases and other nuclear proteins found in eukaryotes.

    Proteins where this domain is known:
    PF08_0080    PF14_0128    PFI1525w   


    SM00581 - PSP (Smart link)

    Interpro entry IPR006568 : (Interpro link)

    Interpro description:

    PSP is a proline-rich domain of unknown function found in spliceosome associated proteins.

    Proteins where this domain is known:
    PF14_0587   


    SM00582 - RPR (Smart link)

    Interpro entry IPR006569 : (Interpro link)

    Interpro description:

    RPR is a domain of unknown function present in proteins which are involved in regulation of nuclear pre-mRNA.

    Proteins where this domain is known:
    PF14_0028   


    SM00584 - TLDc (Smart link)

    Interpro entry IPR006571 : (Interpro link)

    Interpro description:

    TLDc is a domain of unknown function, restricted to eukaryotes, and commonly found in TBC and LysM domain containing proteins.

    Proteins where this domain is known:
    MAL13P1.395    PF14_0647    PFI0970c   


    SM00591 - RWD (Smart link)

    Interpro entry IPR006575 : (Interpro link)

    Interpro description:

    The eukaryotic RWD domain is found in RING finger and WD repeat containing proteins and in the DEXDc-like helicase subfamily; it is related to the ubiquitin-conjugating enzyme domain.

    Proteins where this domain is known:
    MAL8P1.41   


    SM00602 - VPS10 (Smart link)

    Interpro entry IPR006581 : VPS10 (Interpro link)

    Interpro description:

    Yeast Vps10p is a receptor for sorting and transport of the soluble vacuolar hydrolase carboxypeptidase Y to the lysosome-like vacuole. In mammalian cells, proteins containing this domain are involved in the transport of lipoproteins and sorting of endosomal proteins. They may also act as receptors for some neuropeptides.

    The N terminus of murine brain SorCS contains two putative cleavage sites for the convertase furin which mark the beginning of the VPS10 domain, which is followed by a module of imperfect leucine-rich repeats and a transmembrane domain. The short intracellular C-terminus contains consensus signals for rapid internalization. The identified putative binding motifs for SH2 and SH3 domains are unique in the family of VPS10 domain receptors. SorCS is predominantly expressed in brain, but also in heart, liver, and kidney. SorCS transcripts detected by in situ hybridization in the murine central nervous system point to a neuronal expression.

    Proteins where this domain is known:
    PF14_0493   


    SM00611 - SEC63 (Smart link)

    Interpro entry IPR018127 : (Interpro link)

    Interpro description:

    This domain (also known as the Brl domain) was named after the yeast Sec63 (or NPL1) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases.

    Proteins where this domain is known:
    PFD1060w   


    SM00612 - Kelch (Smart link)

    Interpro entry IPR006652 : (Interpro link)

    Interpro description:

    Kelch is a 50-residue motif, named after the Drosophila mutant in which it was first identified. This sequence motif represents one beta-sheet blade, and several of these repeats can associate to form a beta-propeller. For instance, the motif appears 6 times in Drosophila egg-chamber regulatory protein, creating a 6-bladed beta-propeller. The motif is also found in mouse protein MIPP and in a number of poxviruses. In addition, kelch repeats have been recognised in alpha- and beta-scruin, and in galactose oxidase from the fungus Dactylium dendroides. The structure of galactose oxidase reveals that the repeated sequence corresponds to a 4-stranded anti-parallel beta-sheet motif that forms the repeat unit in a super-barrel structural fold.

    The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.

    This entry represents a type of kelch sequence motif that comprises one beta-sheet blade.

    Proteins where this domain is known:
    MAL7P1.137    PF13_0238   


    SM00636 - Glyco_18 (Smart link)

    Interpro entry IPR011583 : Chitinase II (Interpro link)

    Interpro description:

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.

    Members of this family belong to the chitinase class II group, which includes chitinase, chitodextrinase and the killer toxin of Kluyveromyces lactis (Yeast) (Candida sphaerica); all belong to glycoside hydrolase family 18. The chitinases hydrolyse chitin oligosaccharides.

    Proteins where this domain is known:
    PFL2510w   


    SM00645 - Pept_C1 (Smart link)

    Interpro entry IPR000668 : Peptidase C1A, papain C-terminal (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of proteins belongs to the peptidase family C1, sub-family C1A (papain family, clan CA). It includes proteins classed as non-peptidase homologs. These have either been shown experimentally to lack peptidase activity or lack one or more of the active site residues.

    The papain family has a wide variety of activities, including broad-range (papain) and narrow-range endo-peptidases, aminopeptidases, dipeptidyl peptidases and enzymes with both exo- and endo-peptidase activity. Members of the papain family are widespread, found in baculovirus, eubacteria, yeast, and practically all protozoa, plants and mammals. The proteins are typically lysosomal or secreted, and proteolytic cleavage of the propeptide is required for enzyme activation, although bleomycin hydrolase is cytosolic in fungi and mammals. Papain-like cysteine proteinases are essentially synthesised as inactive proenzymes (zymogens) with N-terminal propeptide regions. The activation process of these enzymes includes the removal of propeptide regions. The propeptide regions serve a variety of functions in vivo and in vitro. The pro-region is required for the proper folding of the newly synthesised enzyme, the inactivation of the peptidase domain and stabilisation of the enzyme against denaturing at neutral to alkaline pH conditions. Amino acid residues within the pro-region mediate their membrane association, and play a role in the transport of the proenzyme to lysosomes. Among the most notable features of propeptides is their ability to inhibit the activity of their cognate enzymes and that certain propeptides exhibit high selectivity for inhibition of the peptidases from which they originate.

    The catalytic residues of papain are Cys-25 and His-159, other important residues being Gln-19, which helps form the 'oxyanion hole', and Asn-175, which orientates the imidazole ring of His-159.

    Proteins where this domain is known:
    PF11_0161    PF11_0162    PF11_0165    PF11_0174    PF14_0553    PFB0325c    PFB0330c    PFB0335c    PFB0340c    PFB0345c    PFB0350c    PFB0355c    PFB0360c    PFD0230c    PFI0135c    PFL2290w   


    SM00647 - IBR (Smart link)

    Interpro entry IPR002867 : Zinc finger, C6HC-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    This entry represents a cysteine-rich (C6HC) zinc finger domain that is present in Triad1, and which is conserved in other proteins encoded by various eukaryotes. The C6HC consensus pattern is:

    The C6HC zinc finger motif is the fourth family member of the zinc-binding RING, LIM, and LAP/PHD fingers. Strikingly, in most of the proteins the C6HC domain is flanked by two RING finger structures. The novel C6HC motif has been called DRIL (double RING finger linked). The strong conservation of the larger tripartite TRIAD (two RING fingers and DRIL) structure indicates that the three subdomains are functionally linked and identifies a novel class of proteins.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    PFC0175w   


    SM00648 - SWAP (Smart link)

    Interpro entry IPR000061 : SWAP/Surp (Interpro link)

    Interpro description:

    SWAP is derived from the Suppressor-of-White-APricot splicing regulator from Drosophila melanogaster. The domain is found in regulators responsible for pervasive, nonsex-specific alternative pre-mRNA splicing characteristics and has been found in splicing regulatory proteins. These ancient, conserved SWAP proteins share a colinearly arrayed series of novel sequence motifs.

    Proteins where this domain is known:
    PF14_0028    PF14_0713   


    SM00649 - RL11 (Smart link)

    Interpro entry IPR000911 : Ribosomal protein L11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacteria, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

    Proteins where this domain is known:
    PF11_0113    PFE0850c   


    SM00650 - rADc (Smart link)

    Interpro entry IPR001737 : Ribosomal RNA adenine methylase transferase (Interpro link)

    Interpro description:

    This family of proteins includes rRNA adenine dimethylases (e.g. KsgA) and the Erythromycin resistance methylases (Erm).

    The bacterial enzyme KsgA catalyses the transfer of a total of four methyl groups from S-adenosyl-L-methionine (S-AdoMet) to two adjacent adenosine bases in 16S rRNA. This enzyme and the resulting modified adenosine bases appear to be conserved in all species of eubacteria, eukaryotes, and archaea, and in eukaryotic organelles. Bacterial resistance to the aminoglycoside antibiotic kasugamycin involves inactivation of KsgA and resulting loss of the dimethylations, with modest consequences to the overall fitness of the organism. In contrast, the yeast ortholog, Dim1, is essential. In Saccharomyces cerevisiae (Baker's yeast), and presumably in other eukaryotes, the enzyme performs a vital role in pre-rRNA processing in addition to its methylating activity. The best conserved region in these enzymes is located in the N-terminal section and corresponds to a region that is probably involved in S-adenosyl methionine (SAM) binding.

    The crystal structure of KsgA from Escherichia coli has been solved to a resolution of 2.1 Å. It bears a strong similarity to the crystal structure of ErmC' from Bacillus stearothermophilus and a lesser similarity to the yeast mitochondrial transcription factor, sc-mtTFB.

    The Erm family of RNA methyltransferases, which methylate a single adenosine base in 23S rRNA, confer resistance to the MLS-B group of antibiotics. Despite their sequence similarity, the two enzyme families have strikingly different levels of regulation that remain to be elucidated. Other orthologs of this family include the yeast and Homo sapiens (Human) mitochondrial transcription factors (MTF1 and h-mtTFB respectively), which are nuclear encoded. Human-mtTFB is able to stimulate transcription in vitro independently of its S-adenosylmethionine binding and rRNA methyltransferase activity.

    Proteins where this domain is known:
    PF14_0156    PFL2395c   


    SM00651 - Sm (Smart link)

    Interpro entry IPR006649 : (Interpro link)

    Interpro description:

    This family is found in Lsm (like-Sm) proteins, which have a core structure consisting of an open beta-barrel with an SH3-like topology.

    Lsm (like-Sm) proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. Most snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G) arranged in a ring on a uridine-rich sequence (Sm site), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). All Sm proteins contain a common sequence motif in two segments, Sm1 and Sm2, separated by a short variable linker. Other snRNPs, such as U7 snRNP, can contain different Lsm proteins.

    Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins. Archaeal Lsm proteins are likely to represent the ancestral Lsm domain.

    Proteins where this domain is known:
    MAL13P1.253    MAL8P1.48    MAL8P1.9    PF08_0049    PF11_0255    PF11_0266    PF11_0280    PF11_0524    PF13_0142    PF14_0146    PF14_0411    PFB0865w    PFE1020w    PFI0475w    PFL0460w   


    SM00652 - eIF1a (Smart link)

    Interpro entry IPR001253 : Translation initiation factor 1A (eIF-1A) (Interpro link)

    Interpro description:

    Eukaryotic translation initiation factor 1A (eIF-1A) (formerly known as eIF-4C) is a protein that seems to be required for maximal rate of protein biosynthesis. It enhances ribosome dissociation into subunits and stabilizes the binding of the initiator Met-tRNA to 40S ribosomal subunits. The archaea possess an eIF-1A homolog.

    Proteins where this domain is known:
    PF11_0447   


    SM00654 - eIF6 (Smart link)

    Interpro entry IPR002769 : Translation initiation factor IF6 (Interpro link)

    Interpro description:

    This family includes eukaryotic translation initiation factor 6 (eIF6) as well as presumed archaeal homologues.

    The assembly of 80S ribosomes requires joining of the 40S and 60S subunits, which is triggered by the formation of an initiation complex on the 40S subunit. This event is rate-limiting for translation, and depends on external stimuli and the status of the cell. Eukaryotic translation initiation factor 6 (eIF6) binds specifically to the free 60S ribosomal subunit and prevents its association with the 40S ribosomal subunit. Furthermore, eIF6 interacts in the cytoplasm with RACK1, a receptor for activated protein kinase C (PKC). RACK1 is a major component of translating ribosomes, which harbour significant amounts of PKC. Loading 60S subunits with eIF6 caused a dose-dependent translational block and impairment of 80S formation, which are reversed by expression of RACK1 and stimulation of PKC in vivo and in vitro. PKC stimulation leads to eIF6 phosphorylation and its release, promoting 80S ribosome formation. RACK1 provides a physical and functional link between PKC signalling and ribosome activation.

    Proteins where this domain is known:
    PF13_0178   


    SM00657 - RPOL4c (Smart link)

    Interpro entry IPR006590 : (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    A major role in the regulation of eukaryotic protein-coding genes is played by the gene-specific transcriptional regulators, which recruit the RNA polymerase II holoenzyme to the specific promoter. The Rpb4 and Rpb7 subunits of yeast RNA polymerase II form a heterodimeric complex essential for promoter-directed transcription initiation. The Rpb4-Rpb7 complex is not required for stable recruitment of polymerase to active preinitiation complexes, suggesting that Rpb4-Rpb7 mediates an essential step subsequent to promoter binding.

    This entry represents a domain present in DNA-directed RNA polymerase II subunit, Rpb4.

    Proteins where this domain is known:
    PFB0245c   


    SM00662 - RPOLD (Smart link)

    Interpro entry IPR011263 : DNA-directed RNA polymerase, RpoA/D/Rpb3-type (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases(also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise.

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses vary in size from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    The core of the bacterial RNA polymerase (RNAP) consists of four subunits, two alpha, a beta and a beta', which are conserved from bacteria to mammals. The alpha subunit (RpoA) initiates RNAP assembly by dimerising to form a platform on which the beta subunits can interact, and plays a direct role in promoter recognition. In eukaryotes, RNA polymerase (RNAP) II is responsible for all mRNA synthesis. RNAP-II consists of 12 subunits, where subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the bacterial RpoA homodimer. Archaeal RNAP closely resembles eukaryotic RNAP-II, and is composed of 12 subunits, of which D and L form a heterodimer resembling the Rpb3/Rpb11 and RpoA/RpoA dimers.

    The bacterial RpoA, eukaryotic Rpb3 and archaeal D subunits share sequence and structural motifs, and can be placed into a single family. These subunits also have unique sequence motifs, especially at their C-terminal ends, which are involved in promoter specificity, for example the CTD of the bacterial RNAP alpha subunit.

    Proteins where this domain is known:
    PF11_0445    PF13_0040    PF14_0695    PFI1130c   


    SM00663 - RPOLA_N (Smart link)

    Interpro entry IPR006592 : RNA polymerase, N-terminal (Interpro link)

    Interpro description:

    The task of transcribing nuclear genes is shared between three RNA polymerases in eukaryotes: RNA polymerase (pol) I synthesizes the large rRNA, pol II synthesizes mRNA and pol III synthesizes tRNA and 5S rRNA. Pol I transcription is localised to discrete sites called nucleoli; these can be likened to ribosome factories, in which rRNA is synthesised by pol I in the fibrillar centres and then processed and assembled into ribosomes in the surrounding granular regions. Prokaryotes, in contrast, possess a single RNA polymerase, with transcription being controlled by the particular sigma factor interacting with the catalytic core.

    This entry describes an N-terminal conserved region which can be found in the largest subunits of prokaryotic and eukaryotic RNA polymerases.

    Proteins where this domain is known:
    PF13_0150    PFC0805w    PFE0465c   


    SM00667 - LisH (Smart link)

    Interpro entry IPR006594 : (Interpro link)

    Interpro description:

    The LisH motif is found in a large number of eukaryotic proteins with a wide range of functions, from metazoa, fungi and plants. The recently solved structure of the LisH domain in the N-terminal region of LIS1 showed it to be a novel dimerization motif, and suggested that other structural elements are likely to play an important role in dimerisation.

    A sequence motif, LisH, has been identified in the products of genes mutated in Miller-Dieker lissencephaly, Treacher Collins, oral-facial-digital type 1 and contiguous syndrome ocular albinism with late onset sensorineural deafness syndromes. An additional homologous motif was detected in a gene product fused to the fibroblast growth factor receptor type 1 in patients with an atypical stem cell myeloproliferative disorder. In total, over 100 eukaryotic intracellular proteins are shown to possess a LIS1 homology (LisH) motif, including several katanin p60 subunits, muskelin, tonneau, LEUNIG, Nopp140, aimless and numerous WD repeat-containing beta-propeller proteins.

    It is suggested that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The predicted secondary structure of LisH motifs, and their occurrence in homologues of Gbeta beta-propeller subunits, suggests that they are analogues of Ggamma subunits, and might associate with the periphery of beta-propeller domains.

    Proteins where this domain is known:
    MAL13P1.182    MAL13P1.54    PF13_0018    PF13_0164    PFE0930w    PFL0920c   


    SM00668 - CTLH (Smart link)

    Interpro entry IPR006595 : (Interpro link)

    Interpro description:

    The 33-residue LIS1 homology (LisH) motif is found in eukaryotic intracellular proteins involved in microtubule dynamics, cell migration, nucleokinesis and chromosome segregation. The LisH motif is likely to possess a conserved protein-binding function and it has been proposed that LisH motifs contribute to the regulation of microtubule dynamics, either by mediating dimerization, or else by binding cytoplasmic dynein heavy chain or microtubules directly. The LisH motif is found associated with other domains, such as WD-40, SPRY, Kelch, AAA ATPase, RasGEF, or HEAT. The secondary structure of the LisH domain is predicted to be two alpha-helices.

    Some proteins known to contain a LisH motif are listed below:
  • Animal LIS1. It regulates cytoplasmic dynein function. In Homo sapiens (human) children with defects in LIS1 suffer from Miller-Dieker lissencephaly, a brain malformation that results in severe retardation, epilepsy and an early death.
  • Emericella nidulans (Aspergillus nidulans) nuclear migration protein nudF, the orthologue of LIS1.
  • Eukaryotic RanBPM, a Ran binding protein involved in microtubule nucleation.
  • Eukaryotic Nopp140, a nucleolar phosphoprotein.
  • Mammalian treacle, a nucleolar protein. In human, defects in treacle are the cause of Treacher Collins syndrome (TCS), an autosomal dominant disorder of craniofacial development.
  • Animal muskelin. It acts as a mediator of cell spreading and cytoskeletal responses to the extracellular matrix component thrombospondin 1.
  • Animal transducin beta-like 1 protein (TBL1).
  • Plant tonneau.
  • Arabidopsis thaliana LEUNIG, a putative transcriptional corepressor that regulates AGAMOUS expression during flower development.
  • Fungal aimless RasGEF.
  • Leishmania major katanin-like protein.

    The C-terminal to LisH (CTLH) motif is a predicted alpha-helical sequence of unknown function that is found adjacent to the LisH motif in a number of these proteins but is absent in others (e.g. LIS1). The CTLH domain can also be found in the absence of the LisH motif, as in:
  • Arabidopsis thaliana (Mouse-ear cress) hypothetical protein MUD21.5.
  • Saccharomyces cerevisiae yeast protein RMD5.

    Proteins where this domain is known:
    MAL13P1.182    PF13_0164   


    SM00670 - PINc (Smart link)

    Interpro entry IPR006596 : (Interpro link)

    Interpro description:

    PINc describes a large group of domains which are predicted to play a role in nucleotide-binding, potentially being found in RNases.

    PINc domains in nematode SMG-5 and yeast NMD4p are predicted to be involved in the posttranscriptional gene silencing pathway known as RNA interference (RNAi). In an early step in RNAi, the initiating dsRNA is cleaved into small interfering RNAs (siRNAs), 21-23 nucleotides long, by the enzyme Dicer. After processing by Dicer, siRNAs associate with a multicomponent complex called the RNA-induced silencing complex that recognises and cleaves the cognate message.

    Proteins where this domain is known:
    MAL8P1.67   


    SM00671 - SEL1 (Smart link)

    Interpro entry IPR006597 : (Interpro link)

    Interpro description:

    Sel1-like repeats are tetratricopeptide repeat sequences originally identified in a Caenorhabditis elegans receptor molecule which is a key negative regulator of the Notch pathway. Mammalian homologues have since been identified although these mainly pancreatic proteins have yet to have a function assigned.

    Proteins where this domain is known:
    PF14_0462    PFB0190c    PFC0550w   


    SM00673 - CARP (Smart link)

    Interpro entry IPR006599 : CARP motif (Interpro link)

    Interpro description:

    This entry represents the CARP motif, which occurs as a tandem repeat in the C-terminal of many cyclase-associated proteins (CAPs), as well as in tubulin binding cofactor C and the X-linked retinitis pigmentosa 2 protein (RP2). CARP-containing proteins appear to have a role in cell signalling.

    Cyclase-associated proteins (CAPs) are highly conserved monomeric actin-binding proteins present in a wide range of organisms including yeast, fly, plants, and mammals. CAPs are multifunctional proteins that contain several structural domains. CAP is involved in species-specific signalling pathways. Only yeast CAPs are involved in adenylate cyclase activation. The C-terminal domain of CAP proteins is responsible for G-actin-binding that regulates actin remodelling in response to cellular signals and is required for normal cellular morphology, cell division, growth and locomotion in eukaryotes.

    Tubulin binding cofactor C (or tubulin-specific chaperone C) (TBCC) is a folding cofactor that participates in tubulin biogenesis along with the other tubulin folding cofactors A (TBCA), B (TBCB), E (TBCE) and D (TBCD), as well as the GTP-binding protein Arl2.

    Retinitis pigmentosa (RP) comprises a large group of heterogeneous diseases that result in progressive retinal degeneration. Human X-linked retinitis pigmentosa 2 protein (RP2) consists of an N-terminal beta-helix and a C-terminal ferredoxin-like alpha/beta domain. RP2 is a specific effector protein of the GTP-binding protein Arl3. The Arl3 protein is a member of the Arf (ADP ribosylation factor) subfamily of Ras-related proteins. The beta-helix domain of RP2 is required for the RP2 interaction with Arl3. The CARP motif is found in the N-terminal beta-helix domain of RP2 proteins.

    Proteins where this domain is known:
    PFA0260c   


    SM00679 - CTNS (Smart link)

    Interpro entry IPR006603 : (Interpro link)

    Interpro description:

    This repeated motif of unknown function has been found between the transmembrane helices of cystinosin, yeast ERS1 and mannose-P-dolichol utilization defect 1. The positioning of this repeat suggests that it may be associated with the glycosylation machinery.

    Proteins where this domain is known:
    PF11_0361   


    SM00695 - DUSP (Smart link)

    Interpro entry IPR006615 : Peptidase C19, ubiquitin-specific peptidase, DUSP domain (Interpro link)

    Interpro description:

    Deubiquitinating enzymes (DUB) form a large family of cysteine proteases that can deconjugate ubiquitin or ubiquitin-like proteins from ubiquitin-conjugated proteins. All DUBs contain a catalytic domain surrounded by one or more subdomains, some of which contribute to target recognition. The ~120-residue DUSP (domain present in ubiquitin-specific proteases) domain is one of these specific subdomains. Single or tandem DUSP domains are located both N- and C-terminal to the ubiquitin carboxyl-terminal hydrolase catalytic core domain.

    The DUSP domain displays a tripod-like AB3 fold with a three-helix bundle and a three-stranded anti-parallel beta-sheet resembling the legs and seat of the tripod (see PDB:1W6V). Conserved residues are predominantly involved in hydrophobic packing interactions within the three alpha-helices. The most conserved DUSP residues, forming the PGPI motif, are flanked by two long loops that vary both in length and sequence. The PGPI motif packs against the three-helix bundle and is highly ordered.

    The function of the DUSP domain is unknown but it may play a role in protein/protein interaction or substrate recognition. This domain is associated with ubiquitin carboxyl-terminal hydrolase family 2 (MEROPS peptidase family C19). This is a family of 100 to 200 kDa peptides which includes the Ubp1 ubiquitin peptidase from yeast.

    Proteins where this domain is known:
    PFI0225w   


    SM00698 - MORN (Smart link)

    Interpro entry IPR003409 : (Interpro link)

    Interpro description:

    The MORN (Membrane Occupation and Recognition Nexus) motif is found in multiple copies in several proteins including junctophilins. The function of this motif is unknown.

    Proteins where this domain is known:
    MAL13P1.32    PF10_0101    PF10_0306    PF11_0307    PF14_0121    PF14_0243    PF14_0586    PFB0230c    PFB0520w    PFE0560c    PFE0735w    PFI1275w   


    SM00702 - P4Hc (Smart link)

    Interpro entry IPR006620 : Prolyl 4-hydroxylase, alpha subunit (Interpro link)

    Interpro description:

    Mammalian prolyl 4-hydroxylase alpha catalyses the posttranslational formation of 4-hydroxyproline in -Xaa-Pro-Gly- sequences in collagens and other proteins. Prokaryotic enzymes might catalyse hydroxylation of antibiotic peptides. These are 2-oxoglutarate-dependent dioxygenases, requiring 2-oxoglutarate and dioxygen as cosubstrates and ferrous iron as a cofactor.

    Proteins where this domain is known:
    MAL8P1.8   


    SM00704 - ZnF_CDGSH (Smart link)

    Interpro entry IPR006622 : Iron sulphur domain-containing, CDGSH-type (Interpro link)

    Interpro description:

    This entry represents iron-sulphur domain containing proteins that have a CDGSH sequence motif (although the Ser residue can also be an Ala or Thr), and is found in proteins from a wide range of organisms with the exception of fungi. Proteins carrying this domain include ferredoxin-dependent glutamate synthase. CDGSH-type domains are also found in the iron-containing outer membrane protein mitoNEET. MitoNEET contains the conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H, a defining feature of CDGSH domains, and is likely involved in iron binding.
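
    The mitoNEET consensus quoted above can be written as a regular expression. The following Python sketch is illustrative only: the function name and the toy sequence are hypothetical, X is assumed to mean any single residue, and a regex hit is not equivalent to a SMART or InterPro domain assignment.

```python
import re

# Transcribed from the text: C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H,
# with X = any residue and Xn = a run of n residues (an assumption).
CDGSH_PATTERN = re.compile(r"C.C.{2}[ST].{3}P.CDG[SAT]H")

def find_cdgsh(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group())
            for m in CDGSH_PATTERN.finditer(sequence.upper())]

# Synthetic toy sequence, not a real protein.
toy = "MKK" + "CACAASAAAPACDGSH" + "LLE"
print(find_cdgsh(toy))
```

    The character class [SAT] reflects the statement that the Ser residue of CDGSH can also be Ala or Thr.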

    Proteins where this domain is known:
    PFC0126c   


    SM00710 - PbH1 (Smart link)

    Interpro entry IPR006626 : (Interpro link)

    Interpro description:

    The tertiary structures of pectate lyases and rhamnogalacturonase A show a stack of parallel beta strands that are coiled into a large helix. Each coil of the helix represents a structural repeat that, in some homologues, can be recognised from sequence information alone. Conservation of asparagines might be connected with asparagine-ladders that contribute to the stability of the fold. Proteins containing these repeats most often are enzymes with polysaccharide substrates.

    Proteins where this domain is known:
    PF10_0213    PF14_0101    PFL1010c   


    SM00714 - LITAF (Smart link)

    Interpro entry IPR006629 : (Interpro link)

    Interpro description:

    This entry represents the LITAF domain. LPS-induced tumour necrosis factor alpha factor (LITAF) is induced in mammalian cells following treatment with lipopolysaccharide. The LITAF domain is a possible membrane-associated motif which contains an N-terminal CXXC knuckle followed by a long (25 amino acid) hydrophobic region and a C-terminal (H)XCXXC knuckle. Both of these knuckles are highly characteristic of Zn2+ binding domains, and the N-terminal region of one LITAF domain-containing protein is thought to bind the intracellular molecule Nedd4, which suggests that the hydrophobic region does not span the membrane. It may instead insert into the membrane, bringing together the N- and C-terminal CXXC knuckles to form a compact Zn2+ binding structure.

    Proteins where this domain is known:
    PFD0595w   


    SM00717 - SANT (Smart link)

    Interpro entry IPR001005 : SANT, DNA-binding (Interpro link)

    Interpro description:

    The retroviral oncogene v-myb, and its cellular counterpart c-myb, encode nuclear DNA-binding proteins. These belong to the SANT domain family, which specifically recognises the sequence YAAC(G/T)G. In myb, one of the most conserved regions, consisting of three tandem repeats, has been shown to be involved in DNA-binding.
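
    The recognition sequence YAAC(G/T)G can be searched for with a regular expression, as in the hedged Python sketch below. Y is read as the IUPAC code for a pyrimidine (C or T); the function name and the toy DNA string are hypothetical, and only the strand given is scanned.

```python
import re

# Y = C or T (IUPAC pyrimidine); (G/T) is spelled out in the text.
MYB_SITE = re.compile(r"[CT]AAC[GT]G")

def find_myb_sites(dna: str) -> list[int]:
    """Return 0-based start positions of YAAC(G/T)G on the given strand."""
    return [m.start() for m in MYB_SITE.finditer(dna.upper())]

# Toy sequence, not taken from a real promoter.
print(find_myb_sites("GGCAACGGTTTAACTGAA"))
```

    To cover both strands, the same search could be repeated on the reverse complement of the input.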

    Proteins where this domain is known:
    PF10_0143    PF10_0327    PF11_0241    PF13_0088    PFF1385c    PFI1480w    PFL0290w    PFL0815w    PFL1215c   


    SM00724 - TLC (Smart link)

    Interpro entry IPR006634 : TRAM, LAG1 and CLN8 homology (Interpro link)

    Interpro description:

    TLC is a protein domain with at least 5 transmembrane alpha-helices. Lag1p and Lac1p are essential for acyl-CoA-dependent ceramide synthesis, TRAM is a subunit of the translocon and the CLN8 gene is mutated in Northern epilepsy syndrome. Proteins containing this domain may possess multiple functions such as lipid trafficking, metabolism, or sensing. Trh homologues possess additional homeobox domains.

    Proteins where this domain is known:
    PFE0405c   


    SM00726 - UIM (Smart link)

    Interpro entry IPR003903 : (Interpro link)

    Interpro description:

    The Ubiquitin Interacting Motif (UIM), or 'LALAL-motif', is a stretch of about 20 amino acid residues, which was first described in the 26S proteasome subunit PSD4/RPN-10 that is known to recognise ubiquitin. In addition, the UIM is found, often in tandem or triplet arrays, in a variety of proteins either involved in ubiquitination and ubiquitin metabolism, or known to interact with ubiquitin-like modifiers. Among the UIM proteins are two different subgroups of the UBP (ubiquitin carboxy-terminal hydrolase) family of deubiquitinating enzymes, one F-box protein, one family of HECT-containing ubiquitin-ligases (E3s) from plants, and several proteins containing ubiquitin-associated UBA and/or UBX domains. In most of these proteins, the UIM occurs in multiple copies and in association with other domains such as UBA, UBX, ENTH, EH, VHS, SH3, HECT, VWFA, EF-hand calcium-binding, WD-40, F-box, LIM, protein kinase, ankyrin, PX, phosphatidylinositol 3- and 4-kinase, C2, OTU, dnaJ, RING-finger or FYVE-finger. UIMs have been shown to bind ubiquitin and to serve as a specific targeting signal important for monoubiquitination. Thus, UIMs may have several functions in ubiquitin metabolism each of which may require different numbers of UIMs.

    The UIM is unlikely to form an independent folding domain. Instead, based on the spacing of the conserved residues, the motif probably forms a short alpha-helix that can be embedded into different protein folds.

    Proteins where this domain is known:
    PFF1485w    PFL1295w   


    SM00727 - STI1 (Smart link)

    Interpro entry IPR006636 : (Interpro link)

    Interpro description:

    This describes a heat shock chaperonin-binding motif found in the stress-inducible phosphoprotein STI1. Both N- and C-termini of STI1 are capable of binding heat shock proteins and the domain is found both singly and duplicated in other proteins.

    Proteins where this domain is known:
    PF14_0324    PFE1370w   


    SM00729 - Elp3 (Smart link)

    Interpro entry IPR006638 : (Interpro link)

    Interpro description:

    This domain is found in MoaA, NifB, PqqE, coproporphyrinogen III oxidase, biotin synthase and MiaB families, and includes a representative in the eukaryotic elongator subunit, Elp-3. Some members of the family are methyltransferases.

    Proteins where this domain is known:
    MAL13P1.220    PF14_0066    PFF1070c    PFL1345c   


    SM00730 - PSN (Smart link)

    Interpro entry IPR006639 : Peptidase A22, presenilin signal peptide (Interpro link)

    Interpro description:

    In the MEROPS database, peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    This group of aspartic peptidases belongs to MEROPS peptidase family A22 (presenilin family, clan AD).

    SPP and potential eukaryotic homologs represent a family of aspartic proteases that promote intramembrane proteolysis to release biologically important peptides. Signal peptide peptidase (SPP) catalyses intramembrane proteolysis of some signal peptides after they have been cleaved from a preprotein. In humans, SPP activity is required to generate signal sequence-derived human lymphocyte antigen-E epitopes that are recognised by the immune system, and is required in the processing of the hepatitis C virus core protein.

    Proteins where this domain is known:
    PF14_0543   


    SM00731 - SprT (Smart link)

    Interpro entry IPR006640 : (Interpro link)

    Interpro description:

    This is a family of uncharacterised proteins which includes Escherichia coli SprT. The majority of members contain the metallopeptidase zinc binding signature which has a HExxH motif, however there is no evidence for them being metallopeptidases.

    Proteins where this domain is known:
    MAL13P1.191   


    SM00733 - Mterf (Smart link)

    Interpro entry IPR003690 : (Interpro link)

    Interpro description:

    This family currently contains one sequence of known function: human mitochondrial transcription termination factor (mTERF), a multizipper protein that nevertheless binds to DNA as a monomer, with evidence pointing to intramolecular leucine zipper interactions. The precursors contain a mitochondrial targeting sequence, and the mature mTERF exhibits three leucine zippers, of which one is bipartite, and two widely spaced basic domains. Both basic domains and the three leucine zipper motifs are necessary for DNA binding. The leucine zippers are not implicated in a dimerisation role as in other leucine zippers.

    The rest of the family consists of hypothetical proteins none of which have any functional information.

    Proteins where this domain is known:
    PF07_0113   


    SM00739 - KOW (Smart link)

    Interpro entry IPR005824 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The KOW (Kyprides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and the bacterial transcription antitermination proteins NusG.

    Proteins where this domain is known:
    PF11_0065    PF13_0213    PF14_0579    PF14_0643    PFC0535w    PFF0245w    PFF0535c    PFL1150c   


    SM00744 - RINGv (Smart link)

    Interpro entry IPR011016 : Zinc finger, RING-CH-type (Interpro link)

    Interpro description:

    Zinc finger (Znf) domains are relatively small protein motifs that bind one or more zinc atoms, and which usually contain multiple finger-like protrusions that make tandem contacts with their target molecule. They were first identified as a DNA-binding motif in transcription factor TFIIIA from Xenopus laevis (African clawed frog), however they are now recognised to bind DNA, RNA, protein and/or lipid substrates. Their binding properties depend on the amino acid sequence of the finger domains and of the linker between fingers, as well as on the higher-order structures and the number of fingers. Znf domains are often found in clusters, where fingers can have different binding specificities. There are many superfamilies of Znf motifs, varying in both sequence and structure. They display considerable versatility in binding modes, even between members of the same class (e.g. some bind DNA, others protein), suggesting that Znf motifs are stable scaffolds that have evolved specialised functions. For example, Znf-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, epithelial development, cell adhesion, protein folding, chromatin remodelling and zinc sensing, to name but a few. Zinc-binding motifs are stable structures, and they rarely undergo conformational changes upon binding their target.

    (Note that in certain cases, some Znf domains have diverged such that they still maintain their core structure, but have lost their ability to bind zinc, using other means such as salt bridges or binding to other metals to stabilise the finger-like folds. These domains can show strong sequence identity to zinc-binding motifs, and may therefore be included in Znf entries).

    The RING finger is a well characterised zinc finger which coordinates two zinc atoms in a cross-braced manner. According to the pattern of cysteines and histidines, three different subfamilies of RING finger can be defined. The classical RING finger (RING-HC) has a histidine at the fourth coordinating position and a cysteine at the fifth. In the RING-H2 variant, both the fourth and fifth positions are occupied by histidines. The RING-CH, which is very similar to the classical RING finger, differs from both of these variants in that it has a Cys residue in the fourth position and a His in the fifth. Another difference between the RING-CH and the common RING variants is a somewhat longer peptide segment between the fourth and fifth zinc-coordinating residues. The RING-CH zinc finger thus has the same arrangement of cysteine and histidine (C4HC3) as the PHD zinc finger, but it contains features (spacing between the cysteines and the histidine) characteristic of the genuine RING-finger (C3HC4). The RING-CH-type is an E3 ligase mainly found in proteins associated with membranes.
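
    The three subfamilies described above differ only in the residues at the fourth and fifth zinc-coordinating positions, which the following Python sketch encodes. The function name is hypothetical, and the input is assumed to be the eight zinc ligands in sequence order, already extracted from an alignment (for example "CCCHCCCC" for C3HC4).

```python
def classify_ring(ligands: str) -> str:
    """Classify a RING finger from its eight zinc ligands (each C or H)."""
    ligands = ligands.upper()
    if len(ligands) != 8 or set(ligands) - {"C", "H"}:
        raise ValueError("expected 8 ligands, each C or H")
    # Key = residues at the 4th and 5th coordinating positions.
    variants = {"HC": "RING-HC", "HH": "RING-H2", "CH": "RING-CH"}
    return variants.get(ligands[3:5], "unclassified")

print(classify_ring("CCCHCCCC"))  # C3HC4 ligand pattern
print(classify_ring("CCCCHCCC"))  # C4HC3 ligand pattern
```

    Because the PHD finger shares the C4HC3 ligand arrangement, this ligand-only check cannot separate RING-CH from PHD; the spacing between ligands would be needed for that.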

    The solution structure of the RING-CH-type zinc finger of the herpesvirus Mir1 protein has shown that it is an outlying relative of the cellular RING finger domain family, with its polypeptide backbone much more closely resembling that of RING domains than PHD domains. The only real difference between the classic and variant RING domains, other than the alteration of zinc ligands, is the loss of the small beta-sheet found in RING domains and the replacement of one strand of this sheet with a single turn of helix.

    More information about these proteins can be found at Protein of the Month: Zinc Fingers.

    Proteins where this domain is known:
    MAL13P1.405    PFI0470w   


    SM00751 - BSD (Smart link)

    Interpro entry IPR005607 : (Interpro link)

    Interpro description:

    The BSD domain is a domain of about 60 residues named after the BTF2-like transcription factors, Synapse-associated proteins and DOS2-like proteins in which it is found. It is also found in several hypothetical proteins. The BSD domain occurs in one or two copies in a variety of species ranging from primal protozoan to human. It can be found associated with other domains such as the BTB domain or the U-box in multidomain proteins. The function of the BSD domain is as yet unknown.

    Secondary structure prediction indicates the presence of three predicted alpha helices, which probably form a three-helical bundle in small domains. The third predicted helix contains neighbouring phenylalanine and tryptophan residues - less common amino acids that are invariant in all the BSD domains identified and that are the most striking sequence features of the domain.

    Some proteins known to contain one or two BSD domains are listed below:
  • Mammalian TFIIH basal transcription factor complex p62 subunit (GTF2H1).
  • Yeast RNA polymerase II transcription factor B 73 kDa subunit (TFB1), the homologue of BTF2.
  • Yeast DOS2 protein. It is involved in single-copy DNA replication and ubiquitination.
  • Drosophila synapse-associated protein SAP47.
  • Mammalian SYAP1.
  • Various Arabidopsis thaliana (Mouse-ear cress) hypothetical proteins.

    Proteins where this domain is known:
    PFC1055w    PFD1095w    PFI0730w   


    SM00753 - PAM (Smart link)

    Interpro entry IPR013143 : (Interpro link)

    Interpro description:

    The PAM domain (PCI/PINT associated module) is found in a number of proteins that form multiprotein complexes, e.g. the Sac3-Thp1 complex, the regulatory subunit of the 26S proteasome and the COP-9 signalosome. The domain is present in a single copy and has an alpha-helical fold. It is thought to play a role in protein binding.

    Proteins where this domain is known:
    MAL13P1.190    PFB0240w   


    SM00757 - CRA (Smart link)

    Interpro entry IPR013144 : (Interpro link)

    Interpro description:

    The CRA, or CT11-RanBPM, domain is a protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi).

    Proteins where this domain is known:
    MAL13P1.182    PF10_0140   


    SM00758 - PA14 (Smart link)

    Interpro entry IPR011658 : (Interpro link)

    Interpro description:

    The PA14 domain forms an insert in bacterial beta-glucosidases, other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium pre-spore cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding.

    Proteins where this domain is known:
    PF14_0491   


    SM00775 - LNS2 (Smart link)

    Interpro entry IPR013209 : (Interpro link)

    Interpro description:

    This domain is found in Saccharomyces cerevisiae (Baker's yeast) protein SMP2, proteins with an N-terminal lipin domain and phosphatidylinositol transfer proteins. SMP2 is involved in plasmid maintenance and respiration. Lipin proteins are involved in adipose tissue development and insulin resistance.

    Proteins where this domain is known:
    PFC0150w   


    SM00785 - AARP2CN (Smart link)

    Interpro entry IPR012948 : AARP2CN (Interpro link)

    Interpro description:

    This domain is the central domain of AARP2 (asparagine and aspartate rich protein 2). It is weakly similar to the GTP-binding domain of elongation factor TU. PfAARP2 is an antigen from Plasmodium falciparum of 150 kDa, which is encoded by a unique gene on chromosome 1. The central region of Pfaarp2 contains blocks of repetitions encoding asparagine and aspartate residues.

    Proteins where this domain is known:
    PF14_0494    PFA0330w   


    SM00788 - Adenylsucc_synt (Smart link)

    Interpro entry IPR001114 : Adenylosuccinate synthetase (Interpro link)

    Interpro description:

    Adenylosuccinate synthetase plays an important role in purine biosynthesis, by catalysing the GTP-dependent conversion of IMP and aspartic acid to adenylosuccinate, the first of the two steps that convert IMP to AMP. Adenylosuccinate synthetase has been characterised from various sources ranging from Escherichia coli (gene purA) to vertebrate tissues. In vertebrates, two isozymes are present: one involved in purine biosynthesis and the other in the purine nucleotide cycle.

    The crystal structure of adenylosuccinate synthetase from E. coli reveals that the dominant structural element of each monomer of the homodimer is a central beta-sheet of 10 strands. The first nine strands of the sheet are mutually parallel with right-handed crossover connections between the strands. The 10th strand is antiparallel with respect to the first nine strands. In addition, the enzyme has two antiparallel beta-sheets, composed of two strands and three strands respectively, 11 alpha-helices and two short 3/10-helices. Further, it has been suggested that the similarities in the GTP-binding domains of the synthetase and the p21ras protein are an example of convergent evolution of two distinct families of GTP-binding proteins. Comparison of the structures of adenylosuccinate synthetase from Triticum aestivum and Arabidopsis thaliana with the known structures from E. coli reveals that the overall fold is very similar to that of the E. coli protein.

    Proteins where this domain is known:
    PF13_0287   


    SM00803 - TAF (Smart link)

    Interpro entry IPR004823 : TATA box binding protein associated factor (TAF) (Interpro link)

    Interpro description:

    The TATA box binding protein associated factor (TAF) is part of the transcription initiation factor TFIID multimeric protein complex. TFIID plays a central role in mediating promoter responses to various activators and repressors. It binds tightly to TAFII-250 and directly interacts with TAFII-40. TFIID is composed of TATA binding protein (TBP) and a number of TBP-associated factors (TAFs). TAF proteins adopt a histone-like fold.

    Proteins where this domain is known:
    PF11_0061   


    SM00809 - Alpha_adaptinC2 (Smart link)

    Interpro entry IPR008152 : Clathrin adaptor, alpha/beta/gamma-adaptin, appendage, Ig-like subdomain (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.
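
    The heterotetramer composition described above can be written down as a small lookup table. The following is only an illustrative sketch: the names APComplex, AP_COMPLEXES and subunits are hypothetical, and only AP1 and AP2 are included because those are the only two complexes whose subunits the text enumerates.

```python
from typing import NamedTuple, Tuple


class APComplex(NamedTuple):
    """One heterotetrameric AP complex (hypothetical container type)."""
    large_subunits: Tuple[str, str]  # the two adaptins
    medium: str                      # mu subunit
    small: str                       # sigma subunit


# Subunit names exactly as given in the description above.
AP_COMPLEXES = {
    "AP1": APComplex(("gamma-1-adaptin", "beta-1-adaptin"), "mu-1", "sigma-1"),
    "AP2": APComplex(("alpha-adaptin", "beta-2-adaptin"), "mu-2", "sigma-2"),
}


def subunits(name: str) -> list:
    """Return the four subunits of a complex: two large, one medium, one small."""
    c = AP_COMPLEXES[name]
    return [*c.large_subunits, c.medium, c.small]
```

    Calling subunits("AP2") lists alpha-adaptin first, followed by beta-2-adaptin, mu-2 and sigma-2.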

    GGAs (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) are a family of monomeric clathrin adaptor proteins that are conserved from yeasts to humans. GGAs regulate the clathrin-mediated transport of proteins (such as mannose 6-phosphate receptors) from the TGN to endosomes and lysosomes through interactions with TGN-sorting receptors, sometimes in conjunction with AP-1. GGAs bind cargo, membranes, clathrin and accessory factors. GGA1, GGA2 and GGA3 all contain a domain homologous to the ear domain of gamma-adaptin. GGAs are composed of a single polypeptide with four domains: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The VHS domain is responsible for endocytosis and signal transduction, recognising transmembrane cargo through the ACLL sequence in the cytoplasmic domains of sorting receptors. The GAT domain (also found in Tom1 proteins) interacts with ARF (ADP-ribosylation factor) to regulate membrane trafficking, and with ubiquitin for receptor sorting. The hinge region contains a clathrin box for recognition and binding to clathrin, similar to that found in AP adaptins. The GAE domain is similar to the AP gamma-adaptin ear domain, and is responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, and the GAE (gamma-adaptin ear) domain of GGA adaptor proteins. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology. Although these domains share a similar fold, there is little sequence identity between the alpha/beta-adaptins and gamma-adaptin/GAE.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF14_0529   


    SM00815 - AMA-1 (Smart link)

    Interpro entry IPR003298 : Apical membrane antigen 1 (Interpro link)

    Interpro description:

    A novel antigen of Plasmodium falciparum has been cloned that contains a hydrophobic domain typical of an integral membrane protein. The antigen is designated apical membrane antigen 1 (AMA-1) by virtue of appearing to be located in the apical complex. AMA-1 appears to be transported to the merozoite surface close to the time of schizont rupture.

    The 66 kDa merozoite surface antigen (PK66) of Plasmodium knowlesi, a simian malaria, possesses vaccine-related properties believed to originate from a receptor-like role in parasite invasion of erythrocytes. The sequence of PK66 is conserved throughout Plasmodium, and shows high similarity to P. falciparum AMA-1. Following schizont rupture, the distribution of PK66 changes in a coordinate manner associated with merozoite invasion. Prior to rupture, the protein is concentrated at the apical end, following which it distributes itself entirely across the surface of the free merozoite. Immunofluorescence studies suggest that, during invasion, PK66 is excluded from the erythrocyte at, and behind, the invasion interface.

    Proteins where this domain is known:
    PF11_0344   


    SSF100920 - SSF100920 (Superfamily link)

    Proteins where this domain is known:
    MAL7P1.228    PF08_0054    PF11_0351    PFI0875w   


    SSF100934 - SSF100934 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.540    MAL7P1.228    PF07_0033    PF08_0054    PF11_0351    PFI0875w   


    SSF100950 - SSF100950 (Superfamily link)

    Proteins where this domain is known:
    PF08_0009    PF10_0136    PF14_0511    PFE0730c    PFL2160c    PFL2430c   


    SSF100966 - Transl_init_fac_IF2/IF5_N (Superfamily link)

    Interpro entry IPR016189 : Translation initiation factor IF2/IF5, N-terminal (Interpro link)

    Interpro description:

    The beta subunit of archaeal and eukaryotic translation initiation factor 2 (IF2beta) and the N-terminal domain of translation initiation factor 5 (IF5) show significant sequence homology. Archaeal IF2beta contains two independent structural domains: an N-terminal mixed alpha/beta core domain (topological similarity to the common core of ribosomal proteins L23 and L15e), and a C-terminal domain consisting of a zinc-binding C4 finger. Archaeal IF2beta is a ribosome-dependent GTPase that stimulates the binding of initiator Met-tRNA(i)(Met) to the ribosomes, even in the absence of other factors. The C-terminal domain of eukaryotic IF5 is involved in the formation of the multi-factor complex (MFC), an important intermediate for the 43S pre-initiation complex assembly. IF5 interacts directly with IF1, IF2beta and IF3c, which together with IF2-bound Met-tRNA(i)(Met) form the MFC.

    This entry represents the N-terminal alpha/beta domain found in IF2beta and IF5.

    Proteins where this domain is known:
    PF10_0103    PFL0335c   


    SSF101233 - SSF101233 (Superfamily link)

    Proteins where this domain is known:
    PFC0465c    PFF0505c    PFF1110c   


    SSF101238 - XPC-bd (Superfamily link)

    Interpro entry IPR015360 : XPC-binding domain (Interpro link)

    Interpro description:

    Members of this entry adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair.

    Proteins where this domain is known:
    PF10_0114   


    SSF101353 - Ala-tRNA-synth_IIc_anticod-bd (Superfamily link)

    Interpro entry IPR018162 : Alanyl-tRNA synthetase, class IIc, anti-codon-binding domain (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
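
    The class assignment listed above amounts to a simple lookup. The sketch below is illustrative only: the names CLASS_I, CLASS_II, PREFERRED_HYDROXYL and synthetase_class are hypothetical, and the membership simply mirrors the two lists in the description (it does not model the a, b and c subclasses).

```python
# Amino-acid specificities per synthetase class, as listed in the description.
CLASS_I = {
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
}
CLASS_II = {
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
}

# tRNA hydroxyl to which the aminoacyl group is preferentially coupled.
PREFERRED_HYDROXYL = {"I": "2'", "II": "3'"}


def synthetase_class(amino_acid: str) -> str:
    """Return 'I' or 'II' for an amino acid name (hypothetical helper)."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid: {amino_acid!r}")
```

    For the alanyl-tRNA synthetase covered by this entry, synthetase_class("alanine") falls in class II, for which the 3'-hydroxyl is the preferred site.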

    Proteins where this domain is known:
    PF13_0354   


    SSF101447 - FH2_actin_bd (Superfamily link)

    Interpro entry IPR015425 : (Interpro link)

    Interpro description:

    Formin homology (FH) proteins play a crucial role in the reorganization of the actin cytoskeleton, which mediates various functions of the cell cortex including motility, adhesion, and cytokinesis. Formins are multidomain proteins that interact with diverse signalling molecules and cytoskeletal proteins, although some formins have been assigned functions within the nucleus. Formins are characterised by the presence of three FH domains (FH1, FH2 and FH3), although members of the formin family do not necessarily contain all three domains. The proline-rich FH1 domain mediates interactions with a variety of proteins, including the actin-binding protein profilin, SH3 (Src homology 3) domain proteins, and WW domain proteins. The FH2 domain is required for the self-association of formin proteins through the ability of FH2 domains to directly bind each other, and may also act to inhibit actin polymerisation. The FH3 domain is less well conserved and may be important for determining intracellular localisation of formin family proteins. In addition, some formins can contain a GTPase-binding domain (GBD) required for binding to Rho small GTPases, and a C-terminal conserved Dia-autoregulatory domain (DAD).

    This entry represents the FH2 domain, which was shown by X-ray crystallography to have an elongated, crescent shape containing three helical subdomains.

    Proteins where this domain is known:
    PF14_0035    PFE1545c    PFL0925w   


    SSF101546 - Anti-silence (Superfamily link)

    Interpro entry IPR006818 : Histone chaperone, ASF1-like (Interpro link)

    Interpro description:

    This family includes the yeast and human ASF1 protein. These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers.

    Proteins where this domain is known:
    PFL1180w   


    SSF101744 - RNase_P_Rpp29 (Superfamily link)

    Interpro entry IPR002730 : Ribonuclease P/MRP, p29 subunit, eukaryotic/archaeal (Interpro link)

    Interpro description:

    This entry represents the p29 subunit (also known as Rpp29 or Pop4) of the related ribonucleoproteins ribonuclease (RNase) P and RNase MRP, which can be found in both eukaryotes and archaea. The structure of the RNase P subunit, Rpp29, from Methanobacterium thermoautotrophicum has been determined. Mth Rpp29 is a member of the oligonucleotide/oligosaccharide binding fold family. It contains a structured beta-barrel core and unstructured N- and C-terminal extensions bearing several highly conserved amino acid residues that could be involved in RNA contacts in the protein-RNA complex. Rpp29 catalyses the endonucleolytic cleavage of RNA, removing 5'-extranucleotides from tRNA precursor. It interacts with the Rpp25 and Pop5 subunits.

    RNase P is a ubiquitous ribonucleoprotein enzyme primarily responsible for cleaving the 5' leader sequence during maturation of tRNAs in all three domains of life. In eubacteria, this enzyme is made up of two subunits: a large RNA (approximately 120 kDa) responsible for mediating catalysis, and a small protein cofactor (approximately 15 kDa) that modulates substrate recognition and is required for efficient in vivo catalysis. In contrast, multiple proteins are associated with eukaryotic and archaeal RNase P, and these proteins exhibit no recognizable homology to the conserved bacterial protein subunit. In reconstitution experiments with recombinantly expressed and purified protein subunits Mth Rpp29, a homologue of the Rpp29 protein subunit from eukaryotic RNase P, is an essential protein component of the archaeal holoenzyme. In Saccharomyces cerevisiae (Baker's yeast), RNase P consists of 9 protein subunits (Pop1, Pop3-8, Rpr2 and Rpp1), while in humans there are 10 subunits (Rpp14, 20, 21, 25, 29, 30, 38, 40, hPop1, 5).

    RNase MRP (mitochondrial RNA processing) is an rRNA processing enzyme that cleaves a specific site within precursor rRNA to generate the mature 5'-end of 5.8S rRNA. RNase MRP also cleaves primers for mitochondrial DNA replication and CLB2 mRNA. In yeast, RNase MRP possesses one putatively catalytic RNA and at least 9 protein subunits (Pop1, Pop3-Pop8, Rpp1, Snm1 and Rmp1), and is highly related to RNase P.

    Proteins where this domain is known:
    PFF1355w   


    SSF101790 - Aminomethyltransferase beta-barrel domain (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.390    PF13_0345   


    SSF101898 - SSF101898 (Superfamily link)

    Proteins where this domain is known:
    PFL1065c   


    SSF101904 - SSF101904 (Superfamily link)

    Proteins where this domain is known:
    PFL1120c   


    SSF101931 - EJC_Pym (Superfamily link)

    Interpro entry IPR015362 : (Interpro link)

    Interpro description:

    Members of this family adopt a structure consisting of a small globular all-beta-domain, with a three-stranded beta-sheet and a contiguous beta-hairpin. They bind to Mago alpha-helices via extensive electrostatic interactions and at a beta2-beta3 loop via hydrophobic interactions.

    Proteins where this domain is known:
    PFL0450c   


    SSF101960 - SSF101960 (Superfamily link)

    Proteins where this domain is known:
    PF11_0044   


    SSF101967 - SSF101967 (Superfamily link)

    Proteins where this domain is known:
    PFD0005w    PFF0655c   


    SSF102114 - SSF102114 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.220    PF14_0066    PFE1240w    PFF1070c    PFL1345c   


    SSF102405 - SSF102405 (Superfamily link)

    Proteins where this domain is known:
    PFD0670c   


    SSF102462 - SSF102462 (Superfamily link)

    Proteins where this domain is known:
    PFD0355c    PFF0515c   


    SSF102588 - SSF102588 (Superfamily link)

    Proteins where this domain is known:
    PFF1190c   


    SSF102645 - DNA/pantothenate-metab_flavo_C (Superfamily link)

    Interpro entry IPR007085 : (Interpro link)

    Interpro description:

    This entry represents the C-terminal domain found in DNA/pantothenate metabolism flavoproteins, which affects synthesis of DNA and pantothenate metabolism. These proteins contain ATP, phosphopantothenate, and cysteine binding sites. The structure of this domain has been determined in human phosphopantothenoylcysteine (PPC) synthetase and as the PPC synthase domain (CoaB) from the Escherichia coli coenzyme A bifunctional protein CoaBC. This domain adopts a 3-layer alpha/beta/alpha fold with mixed beta-sheets, which topologically resembles a combination of Rossmann-like and ribokinase-like folds. The structure of these proteins predicts a ping pong mechanism with initial formation of an acyladenylate intermediate, followed by release of pyrophosphate and attack by cysteine to form the final products PPC and AMP.

    Proteins where this domain is known:
    PF11_0036    PFD0610w   


    SSF102712 - SSF102712 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.343   


    SSF102741 - GTP-bd_prot_GTP1/OBG_C (Superfamily link)

    Interpro entry IPR015349 : GTP-binding protein GTP1/OBG, C-terminal (Interpro link)

    Interpro description:

    The Obg family comprises a group of ancient P-loop small G proteins (GTPases) belonging to the TRAFAC (for translation factors) class and can be subdivided into several distinct protein subfamilies. OBG GTPases have been found in both prokaryotes and eukaryotes. The structure of the OBG GTPase from Thermus thermophilus has been determined.

    This entry represents a C-terminal domain found in certain OBG GTPases. This domain contains a four-stranded beta sheet and three alpha helices flanked by an additional beta strand. It is predominantly found in the bacterial GTP-binding protein Obg, and is functionally uncharacterised.

    Proteins where this domain is known:
    PF14_0114   


    SSF102848 - SEP (Superfamily link)

    Interpro entry IPR012989 : (Interpro link)

    Interpro description:

    The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain.

    This domain has a 2-layer beta(3)-alpha(2)-beta fold, and is present in a number of other proteins as well, including FAF1 (Fas-associated factor 1) and undulin 2. Many of these proteins also contain the UBX domain C-terminal to the FAF domain. This domain is found in many eukaryotic proteins.

    Proteins where this domain is known:
    MAL8P1.122   


    SSF102886 - Coprogen_oxidas (Superfamily link)

    Interpro entry IPR001260 : Coproporphyrinogen III oxidase (Interpro link)

    Interpro description:

    Coprogen oxidase (i.e. coproporphyrinogen III oxidase or coproporphyrinogenase) catalyses the oxidative decarboxylation of coproporphyrinogen III to protoporphyrinogen IX in the haem and chlorophyll biosynthetic pathways. The protein is a homodimer containing two internally bound iron atoms per molecule of native protein. The enzyme is active in the presence of molecular oxygen, which acts as an electron acceptor. The enzyme is widely distributed, having been found in a variety of eukaryotic and prokaryotic sources.

    Proteins where this domain is known:
    PF11_0436   


    SSF103025 - SSF103025 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.390    PF13_0345    PF14_0497   


    SSF103111 - AHSA1_N (Superfamily link)

    Interpro entry IPR015310 : Activator of Hsp90 ATPase, N-terminal (Interpro link)

    Interpro description:

    This domain is predominantly found in the protein 'Activator of Hsp90 ATPase'. It adopts a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. Proteins containing this domain bind to the molecular chaperone HSP82 and stimulate its ATPase activity.

    Proteins where this domain is known:
    PF13_0190    PFC0270w   


    SSF103263 - Chorismate_synth (Superfamily link)

    Interpro entry IPR000453 : Chorismate synthase (Interpro link)

    Interpro description:

    Chorismate synthase catalyzes the last of the seven steps in the shikimate pathway, which is used in prokaryotes, fungi and plants for the biosynthesis of aromatic amino acids. It catalyzes the 1,4-trans elimination of the phosphate group from 5-enolpyruvylshikimate-3-phosphate (EPSP) to form chorismate, which can then be used in phenylalanine, tyrosine or tryptophan biosynthesis. Chorismate synthase requires the presence of a reduced flavin mononucleotide (FMNH2 or FADH2) for its activity. Chorismate synthase from various sources shows a high degree of sequence conservation. It is a protein of about 360 to 400 amino-acid residues.

    Proteins where this domain is known:
    PFF1105c   


    SSF103365 - UPF0027 (Superfamily link)

    Interpro entry IPR001233 : (Interpro link)

    Interpro description:

    A number of uncharacterised proteins including Escherichia coli rtcB, Mycobacterium tuberculosis MtCY441.01, Caenorhabditis elegans F16A11.2 and Methanocaldococcus jannaschii (Methanococcus jannaschii) MJ0682 belong to this family.

    Proteins where this domain is known:
    PF11_0068    PFL1060c   


    SSF103456 - SecE_euk_arc (Superfamily link)

    Interpro entry IPR008158 : Protein translocase SEC61 complex gamma subunit (Interpro link)

    Interpro description:

    This family is the protein translocase SEC61 complex gamma subunit of the archaeal and eukaryotic type. It does not match bacterial SecE proteins. Sec61 is required for protein translocation in the endoplasmic reticulum.

    The Sec61 complex (eukaryotes) or SecY complex (prokaryotes) forms a conserved heterotrimeric integral membrane protein complex and forms a protein-conducting channel that allows polypeptides to be transferred across (or integrated into) the endoplasmic reticulum (eukaryotes) or across the cytoplasmic membrane (prokaryotes). This complex is itself a part of a larger translocase heterotrimeric complex composed of alpha, beta and gamma subunits.

    The channel is a passive conduit for polypeptides. It therefore has to associate with other components that provide a driving force. The partner proteins in bacteria and eukaryotes differ. In bacteria, the translocase complex comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The SecA ATPase interacts dynamically with the SecYEG integral membrane components to drive the transmembrane movement of newly synthesized preproteins. In yeast (and probably in all eukaryotes), the full translocase comprises another membrane protein subcomplex (the tetrameric Sec62/63p complex), and the lumenal protein BiP, a member of the Hsp70 family of ATPases. BiP promotes translocation by acting as a molecular ratchet, preventing the polypeptide chain from sliding back into the cytosol.

    Proteins where this domain is known:
    PFB0450w   


    SSF103473 - MFS_gen_substrate_transporter (Superfamily link)

    Interpro entry IPR016196 : (Interpro link)

    Interpro description:

    This entry represents the major facilitator superfamily (MFS) domain, which consists of twelve transmembrane helices. MFS proteins are the largest group of secondary membrane transporters in the cell. Among the different families of transporters, only two occur ubiquitously in all classifications of organisms; these are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients. The MFS family contains members that function as uniporters, symporters or antiporters. In addition, their solute specificities are diverse. MFS proteins contain 12 transmembrane regions (with some variations).

    This domain can be found in glycerol-3-phosphate transporter from Escherichia coli, which transports glycerol-3-phosphate into the cytoplasm and inorganic phosphate into the periplasm. The E. coli proton/sugar transporter lactose permease (LacY) also carries this domain, and acts to couple lactose and H+ translocation.

    Proteins where this domain is known:
    MAL8P1.13    PF10_0034    PF10_0215    PF11_0059    PF11_0172    PF11_0176    PF11_0310    PF13_0019    PF14_0260    PF14_0387    PFA0160c    PFA0240w    PFA0245w    PFB0210c    PFB0275w    PFB0465c    PFC0240c    PFC0530w    PFE0825w    PFE1455w    PFF0690c    PFI0720w    PFI0785c    PFI0955w    PFI1295c    PFL0170w   


    SSF103481 - SSF103481 (Superfamily link)

    Proteins where this domain is known:
    PF07_0064    PF07_0070    PF11_0141    PF11_0333    PF11_0530    PF14_0638    PFE0410w    PFE1130w   


    SSF103486 - ATPase_V0/A0_c/d (Superfamily link)

    Interpro entry IPR002843 : ATPase, V0/A0 complex, subunit C/D (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    The V-ATPases (or V1V0-ATPase) and A-ATPases (or A1A0-ATPase) are each composed of two linked complexes: the V1 or A1 complex contains the catalytic core that hydrolyses/synthesizes ATP, and the V0 or A0 complex that forms the membrane-spanning pore. The V- and A-ATPases both contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis. The V- and A-ATPases more closely resemble one another in subunit structure than they do the F-ATPases, although the function of A-ATPases is closer to that of F-ATPases.

    This entry represents subunit C from the A0 complex of A-ATPases, and subunits C and D from the V0 complex of V-ATPases, all of which are involved in the translocation of protons across a membrane. There is more than one type of D subunit in V-ATPases, where the D1 subunit is ubiquitous, while the D2 subunit has limited tissue expressivity, possibly to account for differential functions, targeting or regulation of V-ATPase activity.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF14_0615   


    SSF103491 - SecY (Superfamily link)

    Interpro entry IPR002208 : SecY protein (Interpro link)

    Interpro description:

    Secretion across the inner membrane in some Gram-negative bacteria occurs via the preprotein translocase pathway. Proteins are produced in the cytoplasm as precursors, and require a chaperone subunit to direct them to the translocase component. From there, the mature proteins are either targeted to the outer membrane, or remain as periplasmic proteins. The translocase protein subunits are encoded on the bacterial chromosome.

    The translocase itself comprises 7 proteins, including a chaperone protein (SecB), an ATPase (SecA), an integral membrane complex (SecY, SecE and SecG), and two additional membrane proteins that promote the release of the mature peptide into the periplasm (SecD and SecF). The chaperone protein SecB is a highly acidic homotetrameric protein that exists as a "dimer of dimers" in the bacterial cytoplasm. SecB maintains preproteins in an unfolded state after translation, and targets these to the peripheral membrane protein ATPase SecA for secretion. The structure of the Escherichia coli SecYEG assembly revealed a sandwich of two membranes interacting through the extensive cytoplasmic domains. Each membrane is composed of dimers of SecYEG. The monomeric complex contains 15 transmembrane helices.

    The eubacterial secY protein interacts with the signal sequences of secretory proteins as well as with two other components of the protein translocation system: secA and secE. SecY is an integral plasma membrane protein of 419 to 492 amino acid residues that apparently contains 10 transmembrane (TM), 6 cytoplasmic and 5 periplasmic regions.

    Cytoplasmic regions 2 and 3, and TM domains 1, 2, 4, 5, 7 and 10 are well conserved: the conserved cytoplasmic regions are believed to interact with cytoplasmic secretion factors, while the TM domains may participate in protein export. Homologs of secY are found in archaebacteria. SecY is also encoded in the chloroplast genome of some algae where it could be involved in a prokaryotic-like protein export system across the two membranes of the chloroplast endoplasmic reticulum (CER) which is present in chromophyte and cryptophyte algae.

    Proteins where this domain is known:
    MAL13P1.231   


    SSF103506 - Mitoch_carrier (Superfamily link)

    Interpro entry IPR001993 : Mitochondrial substrate carrier (Interpro link)

    Interpro description:

    A variety of substrate carrier proteins that are involved in energy transfer are found in the inner mitochondrial membrane or integral to the membrane of other eukaryotic organelles such as the peroxisome. Such proteins include: ADP, ATP carrier protein (ADP/ATP translocase); 2-oxoglutarate/malate carrier protein; phosphate carrier protein; tricarboxylate transport protein (or citrate transport protein); Graves disease carrier protein; yeast mitochondrial proteins MRS3 and MRS4; yeast mitochondrial FAD carrier protein; and many others. Structurally, these proteins can consist of up to three tandem repeats of a domain of approximately 100 residues, each domain containing two transmembrane regions.

    Proteins where this domain is known:
    PF08_0031    PF08_0093    PF10_0051    PF10_0366    PF13_0359    PFA0415c    PFA0435w    PFD0367w    PFI0255c    PFI0425w    PFL0110c    PFL1145w    PFL2000w   


    SSF109604 - SSF109604 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.118    MAL13P1.119    PF14_0672    PFL0475w    PFL1890c   


    SSF109728 - SSF109728 (Superfamily link)

    Proteins where this domain is known:
    PF14_0107   


    SSF109905 - SSF109905 (Superfamily link)

    Proteins where this domain is known:
    PF14_0028    PF14_0713   


    SSF109910 - DUF339 (Superfamily link)

    Interpro entry IPR005631 : (Interpro link)

    Interpro description:

    This entry represents a group of uncharacterised small proteins found in both eukaryotes and prokaryotes, including NMA1147 from Neisseria meningitidis and YgfY from Escherichia coli. YgfY may be involved in transcriptional regulation. The structure of these proteins consists of a complex bundle of five alpha-helices, which is composed of an up-down 3-helix bundle plus an orthogonal 2-helix bundle.

    Proteins where this domain is known:
    MAL7P1.154   


    SSF109993 - SSF109993 (Superfamily link)

    Proteins where this domain is known:
    MAL8P1.82    PF11_0403   


    SSF109998 - Trigger_fac_C_bac (Superfamily link)

    Interpro entry IPR008880 : Trigger factor, C-terminal, bacterial (Interpro link)

    Interpro description:

    In the Escherichia coli cytosol, a fraction of the newly synthesised proteins requires the activity of molecular chaperones for folding to the native state. The major chaperones implicated in this folding process are the ribosome-associated Trigger Factor (TF), and the DnaK and GroEL chaperones with their respective co-chaperones. Trigger Factor is an ATP-independent chaperone and displays chaperone and peptidyl-prolyl-cis-trans-isomerase (PPIase) activities in vitro. It is composed of at least three domains, an N-terminal domain which mediates association with the large ribosomal subunit, a central substrate binding and PPIase domain with homology to FKBP proteins, and a C-terminal domain of unknown function. The positioning of TF at the peptide exit channel, together with its ability to interact with nascent chains as short as 57 residues renders TF a prime candidate for being the first chaperone that binds to the nascent polypeptide chains.

    This entry represents the C-terminal domain of bacterial trigger factor proteins, which has a multi-helical structure consisting of an irregular array of long and short helices. This domain is structurally similar to the peptide-binding domain of the bacterial porin chaperone SurA.

    Proteins where this domain is known:
    PF14_0249   


    SSF110004 - Glycolipid_transfer_prot (Superfamily link)

    Interpro entry IPR014830 : Glycolipid transfer protein, GLTP (Interpro link)

    Interpro description:

    Glycolipid transfer protein (GLTP) is a cytosolic protein that catalyses the intermembrane transfer of glycolipids such as glycosphingolipids, glyceroglycolipids, and possibly glucosylceramides, but not of phospholipids. GLTP has a multi-helical structure consisting of two layers of orthogonally packed helices.

    Proteins where this domain is known:
    PFI0775w   


    SSF110019 - ERO1 (Superfamily link)

    Interpro entry IPR007266 : Endoplasmic reticulum oxidoreductin 1 (Interpro link)

    Interpro description:
    Members of this family are required for the formation of disulphide bonds in the endoplasmic reticulum.

    Proteins where this domain is known:
    PF11_0251   


    SSF110111 - CtaG_Cox11 (Superfamily link)

    Interpro entry IPR007533 : Cytochrome c oxidase assembly protein CtaG/Cox11 (Interpro link)

    Interpro description:
    Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C-terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilising the copper site.

    Proteins where this domain is known:
    PF14_0721   


    SSF110296 - SSF110296 (Superfamily link)

    Proteins where this domain is known:
    PF14_0493   


    SSF110324 - Ribosomal_L27 (Superfamily link)

    Interpro entry IPR001684 : Ribosomal protein L27 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L27 is a protein from the large (50S) subunit; it is essential for ribosome function, but its exact role is unclear. It belongs to a family of ribosomal proteins, examples of which are found in bacteria, chloroplasts of plants and red algae and the mitochondria of fungi (e.g. MRP7 from yeast mitochondria).

    Proteins where this domain is known:
    PF10_0332    PFC0701w   


    SSF110942 - SSF110942 (Superfamily link)

    Proteins where this domain is known:
    PF07_0029    PF11_0188    PF14_0417    PFL1070c   


    SSF110993 - SSF110993 (Superfamily link)

    Proteins where this domain is known:
    PF07_0117   


    SSF111278 - DUF207 (Superfamily link)

    Interpro entry IPR003827 : (Interpro link)

    Interpro description:

    The methyltransferase TYW3 (tRNA-yW-synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3'-position adjacent to the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner.

    Proteins where this domain is known:
    PF10_0179   


    SSF111321 - DUF89 (Superfamily link)

    Interpro entry IPR002791 : (Interpro link)

    Interpro description:

    This entry contains uncharacterised proteins. Those with structural information consist of two domains: an all-alpha domain with a 3-helical bundle fold, and an alpha-beta domain in 3 layers, alpha/beta/alpha.

    Proteins where this domain is known:
    PFB0425c   


    SSF111331 - ATP-NAD_kinase_PpnK-typ (Superfamily link)

    Interpro entry IPR016064 : ATP-NAD kinase, PpnK-type (Interpro link)

    Interpro description:

    ATP-NAD kinases catalyse the phosphorylation of NAD to NADP utilizing ATP and other nucleoside triphosphates as well as inorganic polyphosphate as a source of phosphorus. ATP-NAD kinase contains two domains, where domain 1 has an alpha/beta topology that is related in structure to the N-terminal of phosphofructokinase, and domain 2 has an atypical beta-sandwich topology made of four structural repeats of beta(3) units.

    Proteins where this domain is known:
    PFI0650c   


    SSF46548 - Helical_ferredxn (Superfamily link)

    Interpro entry IPR009051 : Alpha-helical ferredoxin (Interpro link)

    Interpro description:

    The alpha-helical ferredoxin domain contains two Fe4-S4 clusters, typical of bacterial ferredoxin. Iron-sulphur proteins play an important role in electron transfer processes and in various enzymatic reactions. In eukaryotes, the mitochondria are the major site of Fe-S cluster biosynthesis in the cell, used for the assembly of mitochondrial and non-mitochondrial Fe-S proteins. The alpha-helical ferredoxin domain is present in several proteins involved in redox reactions, including the C-terminal of the respiratory proteins succinate dehydrogenase (SQR) in bacteria/mitochondria, and fumarate reductase (QFR) in bacteria. SQR is analogous to the mitochondrial respiratory complex II, and is involved in the electron transport pathway from succinate as a donor to the acceptor ubiquinone. SQR helps prevent the formation of reactive oxygen species and is used during aerobic respiration, whereas QFR does not and, consequently, is used to catalyse the final step of anaerobic respiration using the acceptor fumarate.

    The alpha-helical ferredoxin domain is also present in the N-terminal of the cytosolic protein dihydropyrimidine dehydrogenase (DPD), which catalyses the NADPH-dependent, rate-limiting step in pyrimidine degradation, converting pyrimidines to 5,6-dihydro compounds. DPD catalysis involves electron transfer from NADPH to the substrate via the Fe4-S4 centre and FAD. In mammals, this pathway produces the neurotransmitter beta-alanine.

    Proteins where this domain is known:
    PF14_0334    PFL0630w   


    SSF46561 - Ribosomal_L29 (Superfamily link)

    Interpro entry IPR001854 : Ribosomal protein L29 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L29 is one of the proteins from the large ribosomal subunit. L29 belongs to a family of ribosomal proteins of 63 to 138 amino-acid residues which, on the basis of sequence similarities, groups:

    Proteins where this domain is known:
    PF11_0260    PFL0400w   


    SSF46565 - DnaJ_N (Superfamily link)

    Interpro entry IPR001623 : Heat shock protein DnaJ, N-terminal (Interpro link)

    Interpro description:

    The prokaryotic heat shock protein DnaJ interacts with the chaperone hsp70-like DnaK protein. Structurally, the DnaJ protein consists of an N-terminal conserved domain (called 'J' domain) of about 70 amino acids, a glycine-rich region ('G' domain) of about 30 residues, a central domain containing four repeats of a CXXCXGXG motif ('CRR' domain) and a C-terminal region of 120 to 170 residues.

    It is thought that the 'J' domain of DnaJ mediates the interaction with the dnaK protein and consists of four helices, the second of which has a charged surface that includes at least one pair of basic residues that are essential for interaction with the ATPase domain of Hsp70. The J- and CRR-domains are found in many prokaryotic and eukaryotic proteins, either together or separately. In yeast, J-domains have been classified into 3 groups; the class III proteins are functionally distinct and do not appear to act as molecular chaperones.

    Proteins where this domain is known:
    MAL13P1.162    MAL13P1.277    MAL8P1.204    PF07_0103    PF08_0032    PF08_0115    PF10_0032    PF10_0058    PF10_0378    PF10_0381    PF11_0034    PF11_0099    PF11_0273    PF11_0380    PF11_0433    PF11_0443    PF11_0509    PF11_0512    PF11_0513    PF13_0036    PF13_0102    PF14_0013    PF14_0111    PF14_0137    PF14_0213    PF14_0359    PF14_0700    PFA0110w    PFA0660w    PFA0675w    PFB0085c    PFB0090c    PFB0595w    PFB0920w    PFB0925w    PFD0462w    PFE0055c    PFE0135w    PFE1170w    PFF1010c    PFF1415c    PFI0855w    PFI0935w    PFI0985c    PFL0055c    PFL0565w    PFL0815w    PFL2550w   


    SSF46579 - Prefoldin (Superfamily link)

    Interpro entry IPR009053 : (Interpro link)

    Interpro description:

    The Prefoldin/GimC family of proteins are found in eukaryotes and archaea. Prefoldin is part of a molecular chaperone system that promotes the correct folding of nascent polypeptide chains. Prefoldin/GimC interacts with the nascent chain to stabilise it prior to its folding within the central cavity of a chaperonin. Prefoldin/GimC is a hexamer consisting of two types of subunits, alpha and beta. Archaeal prefoldin contains one type of alpha and one type of beta subunit, while eukaryotic prefoldin/GimC contains two different but related alpha subunits and four related beta subunits.

    Proteins where this domain is known:
    MAL7P1.94    PF11_0292    PF14_0167    PFE0290c    PFE0595w    PFI0350c   


    SSF46589 - tRNA_binding_arm (Superfamily link)

    Interpro entry IPR010978 : tRNA-binding arm (Interpro link)

    Interpro description:

    This entry represents an alpha-helical tRNA-binding arm found in class I and II aminoacyl-tRNA synthetase enzymes, as well as in the methicillin resistance protein FemA.

    The tRNA-binding arm domain is conserved between class I and class II aminoacyl-tRNA synthetase enzymes, consisting of two alpha helices in an antiparallel hairpin with a left-handed twist. The appended tRNA-binding domains recognize a small number of nucleotides that are conserved specifically in each cognate tRNA species for the discrimination between the cognate and noncognate tRNAs. These nucleotides are called identity elements, and constitute the identity set. The tRNA-binding arm occurs as the C-terminal domain in some class I enzymes, such as valyl-tRNA synthetase, and as the N-terminal domain in some class II enzymes, such as phenylalanyl-tRNA synthetase.

    The methicillin resistance protein, FemA (factors essential for methicillin resistance), contains a probable tRNA-binding arm that is similar in structure to those found in tRNA synthetases. In FemA, the tRNA-binding arm is inserted into the C-terminal NAT-like domain, and is thought to bind tRNA-glycine. FemA, along with FemB and FemX, plays a vital role in peptidoglycan biosynthesis specific to Staphylococci.

    Proteins where this domain is known:
    PF07_0073   


    SSF46609 - SODismutase (Superfamily link)

    Interpro entry IPR001189 : Manganese and iron superoxide dismutase (Interpro link)

    Interpro description:

    Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.

    The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).

    Proteins where this domain is known:
    PF08_0071    PFF1130c   


    SSF46626 - Cytochrome_c (Superfamily link)

    Interpro entry IPR009056 : Cytochrome c, monohaem (Interpro link)

    Interpro description:

    After cytochrome c is synthesized in the cytoplasm as apocytochrome c, it is transported through the outer mitochondrial membrane to the intermembrane space, where haem is covalently attached by thioether bonds to two cysteine residues located in the cytochrome c centre. Cytochrome c is required during oxidative phosphorylation as an electron shuttle between Complex III (cytochrome c reductase) and IV (cytochrome c oxidase). In addition, cytochrome c is involved in apoptosis in more complex organisms such as Xenopus, rats and humans. Cellular stress can induce cytochrome c release from the mitochondrial membrane. In mammals, cytochrome c triggers the assembly of the apoptosome, consisting of cytochrome c, Apaf-1 and dATP, which activates caspase-9, leading to cell death. There are several different members of the cytochrome c family with different functional roles, for instance cytochrome c549 is associated with photosystem II.

    The known structures of c-type cytochromes have six different classes of fold. Of these, four are unique to c-type cytochromes. The consensus sequence for the cytochrome c centre is Cys-X-X-Cys-His, where the histidine residue is one of the two axial ligands of the haem iron. This arrangement is shared by all proteins known to belong to the cytochrome c family, which presently includes both mono-haem proteins and multi-haem proteins. This entry represents mono-haem cytochrome c proteins (excluding class II and f-type cytochromes), such as cytochromes c, c1, c2, c5, c555, c550 to c553, c556, and c6.

    Cytochrome c-type centres are also found in the active sites of many enzymes, including cytochrome cd1-nitrite reductase as the N-terminal haem c domain, in quinoprotein alcohol dehydrogenase as the C-terminal domain, in Quinohemoprotein amine dehydrogenase A chain as domains 1 and 2, and in the cytochrome bc1 complex as the cytochrome bc1 domain.

    Proteins where this domain is known:
    MAL13P1.55    PF14_0038    PF14_0597    PFI1830c   


    SSF46689 - Homeodomain_like (Superfamily link)

    Interpro entry IPR009057 : (Interpro link)

    Interpro description:

    Homeodomain proteins are transcription factors that share a related DNA binding homeodomain. The homeodomain was first identified in a number of Drosophila homeotic and segmentation proteins, but is now known to be well conserved in many other animals, including vertebrates. The domain binds DNA through a helix-turn-helix (HTH) structure. The HTH motif is characterised by two alpha-helices, which make intimate contacts with the DNA and are joined by a short turn. The second helix binds to DNA via a number of hydrogen bonds and hydrophobic interactions, which occur between specific side chains and the exposed bases and thymine methyl groups within the major groove of the DNA. The first helix helps to stabilise the structure. Many proteins contain homeodomains, including Drosophila Engrailed, yeast mating type proteins, hepatocyte nuclear factor 1a and HOX proteins.

    The homeodomain motif is very similar in sequence and structure to domains in a wide range of DNA-binding proteins, including recombinases, Myb proteins, GARP response regulators, human telomeric proteins (hTRF1), paired domain proteins (PAX), yeast RAP1, centromere-binding proteins CENP-B and ABP-1, transcriptional regulators (TyrR), AraC-type transcriptional activators, and tetracycline repressor-like proteins (TetR, QacR, YcdC).

    Proteins where this domain is known:
    PF10_0143    PF10_0327    PF11_0053    PF11_0241    PF13_0088    PFC1120c    PFF1385c    PFI1480w    PFL0290w    PFL0815w    PFL1215c   


    SSF46774 - ARID (Superfamily link)

    Interpro entry IPR001606 : AT-rich interaction region (Interpro link)

    Interpro description:

    Members of the recently discovered ARID (AT-rich interaction domain) family of DNA-binding proteins are found in fungi and invertebrate and vertebrate metazoans. ARID-encoding genes are involved in a variety of biological processes including embryonic development, cell lineage gene regulation and cell cycle control. Although the specific roles of this domain and of ARID-containing proteins in transcriptional regulation are yet to be elucidated, they include both positive and negative transcriptional regulation and a likely involvement in the modification of chromatin structure. The basic structure of the ARID domain appears to be a series of six alpha-helices separated by beta-strands, loops, or turns, but the structured region may extend to an additional helix at either or both ends of the basic six. Based on primary sequence homology, they can be partitioned into three structural classes: Minimal ARID proteins that consist of a core domain formed by six alpha helices; ARID proteins that supplement the core domain with an N-terminal alpha-helix; and Extended-ARID proteins, which contain the core domain and additional alpha-helices at their N- and C-termini.

    The human SWI-SNF complex protein p270 is an ARID family member with non-sequence-specific DNA binding activity. The ARID consensus and other structural features are common to both p270 and yeast SWI1, suggesting that p270 is a human counterpart of SWI1. The approximately 100-residue ARID sequence is present in a series of proteins strongly implicated in the regulation of cell growth, development, and tissue-specific gene expression. Although about a dozen ARID proteins can be identified from database searches, to date, only Bright (a regulator of B-cell-specific gene expression), dead ringer (a Drosophila melanogaster gene product required for normal development), and MRF-2 (which represses expression from the Cytomegalovirus enhancer) have been analyzed directly in regard to their DNA binding properties. Each binds preferentially to AT-rich sites. In contrast, p270 shows no sequence preference in its DNA binding activity, thereby demonstrating that AT-rich binding is not an intrinsic property of ARID domains and that ARID family proteins may be involved in a wider range of DNA interactions.

    Proteins where this domain is known:
    PFF0175c   


    SSF46785 - SSF46785 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.190    PF08_0094    PF10_0174    PF11_0303    PF11_0469    PF13_0237    PF14_0025    PF14_0261    PF14_0278    PF14_0327    PFC0441c    PFD0880w    PFD0975w    PFD1055w    PFE1405c    PFF1445c    PFL0310c    PFL0625c   


    SSF46906 - Ribosomal_L11 (Superfamily link)

    Interpro entry IPR000911 : Ribosomal protein L11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacteria, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

    Proteins where this domain is known:
    PF11_0113    PFE0850c   


    SSF46911 - Ribosomal_S18 (Superfamily link)

    Interpro entry IPR001648 : Ribosomal protein S18 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    The small ribosomal subunit protein S18 is known to be involved in binding the aminoacyl-tRNA complex in Escherichia coli, and appears to be situated at the tRNA A-site. Experimental evidence has revealed that S18 is well exposed on the surface of the E. coli ribosome, and is a secondary rRNA binding protein. S18 belongs to a family of ribosomal proteins that includes: eubacterial S18; metazoan mitochondrial S18, algal and plant chloroplast S18; and cyanelle S18.

    Proteins where this domain is known:
    PFL0570c   


    SSF46924 - RNA_pol_N/8_sub (Superfamily link)

    Interpro entry IPR000268 : RNA polymerases, N/8 Kd subunits (Interpro link)

    Interpro description:
    In eukaryotes, there are three different forms of DNA-dependent RNA polymerases transcribing different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. In archaebacteria, there is generally a single form of RNA polymerase which also consists of an oligomeric assemblage of 10 to 13 polypeptides. Archaebacterial subunit N (gene rpoN) is a small protein of about 8 kDa; it is evolutionarily related to an 8.3 kDa component shared by all three forms of eukaryotic RNA polymerases (gene RPB10 in yeast and POLR2J in mammals) as well as to African swine fever virus (ASFV) protein CP80R. There is a conserved region which is located at the N-terminal extremity of these polymerase subunits; this region contains two cysteines that bind a zinc ion.

    Proteins where this domain is known:
    PF07_0027   


    SSF46934 - UBA_like (Superfamily link)

    Interpro entry IPR009060 : (Interpro link)

    Interpro description:

    UBA domains are a commonly occurring sequence motif of approximately 45 amino acid residues that are found in diverse proteins involved in the ubiquitin/proteasome pathway, DNA excision-repair, and cell signalling via protein kinases. HHR23A, the human homologue of yeast Rad23A, is a nucleotide excision-repair protein that contains both an internal and a C-terminal UBA domain. The fold of the UBA domain consists of a compact three-helical bundle with a right-handed twist, and has a conserved hydrophobic surface patch for protein-protein interactions. UBA-like domains can be found in other proteins as well, such as the TS-N domain in the elongation factor Ts (EF-Ts), which catalyses the recycling of the GTPase EF-Tu required for the binding of aminoacyl-tRNA to the ribosomal A site; and the C-terminal domain of TAP/NXF1, which functions in nuclear export through the interaction of its UBA-like domain with FG nucleoporins.

    Proteins where this domain is known:
    PF10_0114    PF11_0329    PF13_0301    PFC0225c    PFD0655c   


    SSF46938 - Sec14p_like_N (Superfamily link)

    Interpro entry IPR011074 : (Interpro link)

    Interpro description:

    This entry contains the phosphatidylinositol transfer protein, Sec14p, which catalyses the exchange of phosphatidylinositol and phosphatidylcholine between membrane bilayers in vitro. Other related proteins include the retinaldehyde/retinal-binding proteins, which are functional components of the visual cycle, guanine nucleotide exchange factor, and alpha-tocopherol transfer protein, which enhances the transfer of ligand between separate membranes, as well as several hypothetical proteins.

    Proteins where this domain is known:
    PF11_0287   


    SSF46942 - TFIIS_centre (Superfamily link)

    Interpro entry IPR003618 : Transcription elongation factor S-II, central region (Interpro link)

    Interpro description:

    Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

    TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    This domain is found in the central region of transcription elongation factor S-II and in several hypothetical proteins.

    Proteins where this domain is known:
    PF07_0057    PF11_0289   


    SSF46946 - Ribosomal_H2TH (Superfamily link)

    Interpro entry IPR010979 : Ribosomal protein S13-like, H2TH (Interpro link)

    Interpro description:

    Ribosomal protein S13 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S13 is known to be involved in binding fMet-tRNA and, hence, in the initiation of translation. S13 contains three helices and a beta-hairpin in the core of the protein, which form a helix-two turns-helix (H2TH) motif, and a non-globular C-terminal extension.

    This H2TH motif can be found in other proteins as well. In the DNA repair protein, MutM (formamidopyrimidine DNA glycosylase; Fpg), the middle domain contains the H2TH motif. MutM is a trifunctional DNA base excision repair enzyme that removes a wide range of oxidatively damaged bases (N-glycosylase activity) and cleaves both the 3'- and 5'-phosphodiester bonds of the resulting apurinic/apyrimidinic site (AP lyase activity). Other repair enzymes, such as E. coli Endonuclease VIII that excises oxidized pyrimidines from DNA, also contain a DNA-binding H2TH motif within the middle domain. The H2TH domains of these repair proteins are only peripherally involved in binding DNA; their primary function may be simply to position the N-terminal lobe and C-terminal zinc finger domain of the glycosylases for interactions with DNA.

    The middle domain of the topoisomerase VI-B subunit contains an H2TH motif that is structurally related to that of the DNA repair proteins. Although the H2TH domain appears to be retained in all archaeal and plant type IIB topoisomerases identified to date, it has no known function and has not been observed in other topoisomerase families.

    Proteins where this domain is known:
    PF11_0272   


    SSF46950 - TFAR19-related (Superfamily link)

    Interpro entry IPR002836 : DNA-binding TFAR19-related protein (Interpro link)

    Interpro description:

    This protein family is found in archaea and eukaryota. The human TFAR19 gene encodes a protein which shares significant homology with the corresponding proteins of species ranging from yeast to mice. TFAR19 exhibits a ubiquitous expression pattern and its expression is up-regulated in tumour cells undergoing apoptosis. TFAR19 may play a general role in the apoptotic process. Also included in this family is a DNA-binding protein from the archaeon Methanobacterium thermoautotrophicum.

    Proteins where this domain is known:
    PFI0450c   


    SSF46955 - Putativ_DNA_bind (Superfamily link)

    Interpro entry IPR009061 : Putative DNA binding (Interpro link)

    Interpro description:

    A putative DNA-binding domain with a conserved structure is found in several different protein families. The core structure of the domain consists of a three-helical fold that is architecturally similar to that of the "winged-helix" fold, but is topologically distinct. Representatives of this domain can be found in domains B1 and B5 from the beta subunit of phenylalanine-tRNA synthetases, the C-terminal region of the DNA/RPA-binding domain of the DNA excision repair factor XPA, the N-terminal domain of the transcriptional activators BmrR and MtaN, the most conserved domain of the retinal development protein Dachshund, and the DNA-binding domain of the gpNU1 subunit from the bacteriophage lambda viral packing protein terminase.

    Proteins where this domain is known:
    MAL7P1.32    PF11_0051   


    SSF46977 - Succ_DH_flav_C (Superfamily link)

    Interpro entry IPR015939 : Fumarate reductase/succinate dehydrogenase flavoprotein-like, C-terminal (Interpro link)

    Interpro description:

    This entry represents a domain with a spectrin-repeat-like fold consisting of three helices in a closed bundle with a left-handed twist. This domain is found in the succinate dehydrogenase/fumarate reductase oxidoreductase family of proteins, such as:

    Proteins where this domain is known:
    PF10_0334   


    SSF46988 - Tub_bind_cof_A (Superfamily link)

    Interpro entry IPR004226 : Tubulin binding cofactor A (Interpro link)

    Interpro description:

    The folding pathway of tubulins includes highly specific interactions with a series of cofactors (A, B, C, D and E) after they are released from the eukaryotic chaperonin CCT. Cofactors A and D capture and stabilise tubulin in a quasi-native conformation. Cofactor E binds to the cofactor D-tubulin complex, and interaction with cofactor C then causes the release of tubulin polypeptides in the native state. This family is the tubulin-specific chaperone A.

    Proteins where this domain is known:
    PFA0460c   


    SSF47027 - ACBP (Superfamily link)

    Interpro entry IPR000582 : Acyl-CoA-binding protein, ACBP (Interpro link)

    Interpro description:

    Acyl-CoA-binding protein (ACBP) is a small (10 Kd) protein that binds medium- and long-chain acyl-CoA esters with very high affinity and may function as an intracellular carrier of acyl-CoA esters. ACBP is also known as diazepam binding inhibitor (DBI) or endozepine (EP) because of its ability to displace diazepam from the benzodiazepine (BZD) recognition site located on the GABA type A receptor. It is therefore possible that this protein also acts as a neuropeptide to modulate the action of the GABA receptor.

    ACBP is a highly conserved protein of about 90 residues that is found in all four eukaryotic kingdoms, Animalia, Plantae, Fungi and Protista, and in some eubacterial species.

    Although ACBP occurs as a completely independent protein, intact ACB domains have been identified in a number of large, multifunctional proteins in a variety of eukaryotic species. These include large membrane-associated proteins with N-terminal ACB domains, multifunctional enzymes with both ACB and peroxisomal Delta(3),Delta(2)-enoyl-CoA isomerase domains, and proteins with both an ACB domain and ankyrin repeats.

    The ACB domain consists of four alpha-helices arranged in a bowl shape with a highly exposed acyl-CoA-binding site. The ligand is bound through specific interactions with residues on the protein, most notably several conserved positive charges that interact with the phosphate group on the adenosine-3'-phosphate moiety, and the acyl chain is sandwiched between the hydrophobic surfaces of CoA and the protein.

    Other proteins containing an ACB domain include:

    Proteins where this domain is known:
    PF08_0099    PF10_0015    PF10_0016    PF11_0197    PF14_0749   


    SSF47060 - S15/NS1_bind (Superfamily link)

    Interpro entry IPR009068 : (Interpro link)

    Interpro description:

    The RNA-binding domains of the ribosomal protein S15 and the influenza virus non-structural protein NS1 share the same structural fold, consisting of three helices in an irregular array. S15 is one of 21 proteins in the small, bacterial 30S ribosomal subunit, and is required for assembly of the subunit through its binding to 16S rRNA. The multifunctional glutamyl-prolyl-tRNA synthetase (EPRS) contains three tandem repeats linking two catalytic domains, all three of which contribute to RNA-binding; the second repeated element bears structural resemblance to the S15/NS1 RNA-binding domain.

    Proteins where this domain is known:
    PF11_0072    PF13_0059    PF13_0316   


    SSF47095 - HMG-box (Superfamily link)

    Interpro entry IPR009071 : (Interpro link)

    Interpro description:

    High mobility group (HMG) box domains are involved in binding DNA, and may be involved in protein-protein interactions as well. The structure of the HMG-box domain consists of three helices in an irregular array. HMG-box domains are found in one or more copies in HMG-box proteins, which form a large, diverse family involved in the regulation of DNA-dependent processes such as transcription, replication, and strand repair, all of which require the bending and unwinding of chromatin. Many of these proteins are regulators of gene expression. HMG-box proteins are found in a variety of eukaryotic organisms, and can be broadly divided into two groups, based on sequence-dependent and sequence-independent DNA recognition; the former usually contain one HMG-box motif, while the latter can contain multiple HMG-box motifs.

    HMG-box domains can be found in single or multiple copies in the following protein classes: HMG1 and HMG2 non-histone components of chromatin; SRY (sex determining region Y protein) involved in differential gonadogenesis; the SOX family of transcription factors; sequence-specific LEF1 (lymphoid enhancer binding factor 1) and TCF-1 (T-cell factor 1) involved in regulation of organogenesis and thymocyte differentiation; structure-specific recognition protein SSRP involved in transcription and replication; MTF1 mitochondrial transcription factor; nucleolar transcription factors UBF 1/2 (upstream binding factor) involved in transcription by RNA polymerase I; Abf2 yeast ARS-binding factor; yeast transcription factors Ixr1, Rox1, Nhp6b and Spp41; mating type proteins (MAT) involved in the sexual reproduction of fungi; and the YABBY plant-specific transcription factors.

    Proteins where this domain is known:
    MAL13P1.290    MAL8P1.72    PFL0145c    PFL0290w   


    SSF47113 - Histone-fold (Superfamily link)

    Interpro entry IPR009072 : Histone-fold (Interpro link)

    Interpro description:

    Histones mediate DNA organisation and play a dominant role in regulating eukaryotic transcription. The histone-fold consists of a core of three helices, where the long middle helix is flanked at each end by shorter ones. Proteins displaying this structure include the nucleosome core histones, which form octamers composed of two copies of each of the four histones, H2A, H2B, H3 and H4; archaeal histone, which possesses only the core domain part of eukaryotic histone; and the TATA-box binding protein (TBP)-associated factors (TAF), where the histone fold is a common motif for mediating TAF-TAF interactions. TAF proteins include TAF(II)18 and TAF(II)28, which form a heterodimer, TAF(II)42 and TAF(II)62, which form a heterotetramer similar to (H3-H4)2, and the negative cofactor 2 (NC2) alpha and beta chains, which form a heterodimer. The TAF proteins are a component of transcription factor IID (TFIID), along with the TBP protein. TFIID forms part of the pre-initiation complex on core promoter elements required for RNA polymerase II-dependent transcription. The TAF subunits of TFIID mediate transcriptional activation of subsets of eukaryotic genes. The NC2 complex mediates the inhibition of TATA-dependent transcription through interactions with TBP.

    Proteins where this domain is known:
    PF07_0054    PF11_0061    PF11_0062    PF11_0477    PF13_0043    PF13_0185    PF14_0374    PFC0920w    PFF0510w    PFF0860c    PFF0865w   


    SSF47216 - Prot_act_regA (Superfamily link)

    Interpro entry IPR009077 : Proteasome activator pa28, REG alpha/beta subunit (Interpro link)

    Interpro description:

    The 20S proteasome is a multicatalytic complex that is responsible for the non-lysosomal degradation of intracellular proteins. The proteasome is composed of a catalytic core that is regulated by protein complexes, which bind to the ends of the cylindrical core structure. One of these regulatory complexes is the PA28 activator complex (also known as the 11S regulator, or REG), a ring-shaped hexameric structure that enhances the peptidase activity of the core enzyme. Three REG subunits have been isolated, REGalpha, REGbeta and REGgamma. REGalpha and REGbeta preferentially form a heteromeric complex with alternating alpha and beta subunits. The structure of the human REGalpha subunit reveals a heptameric barrel-shaped assembly containing a central channel. The binding of REG is thought to create a pore through which substrates and products can pass.

    Proteins where this domain is known:
    PFI0370c   


    SSF47240 - Ferritin/RR_like (Superfamily link)

    Interpro entry IPR009078 : Ferritin/ribonucleotide reductase-like (Interpro link)

    Interpro description:

    Ferritin is one of the major non-haem iron storage proteins in animals, plants, and microorganisms. Ferritin is a multisubunit protein with a hollow interior, which contains a mineral core of hydrated ferric oxide, thereby ensuring its solubility in an aqueous environment. Each subunit consists of a closed, four-helical bundle with a left-handed twist and one crossover connection.

    This family contains ferritin and other ferritin-like proteins such as bacterioferritin (cytochrome b1) that binds haem between two subunits, non-haem ferritin, dodecameric ferritin homologue (DPS) that binds to and protects DNA, and the N-terminal domain of rubrerythrin that is found in many air-sensitive bacteria and archaea. In addition, ribonucleotide reductase-like proteins show a similar structure to the ferritin-like fold; these di-iron carboxylate proteins constitute a diverse class of non-haem iron enzymes performing a multitude of redox reactions. This family includes the alpha and beta subunits of methane monooxygenase hydroxylase, delta 9-stearoyl-acyl carrier protein desaturase and manganese catalase (T-catalase).

    Proteins where this domain is known:
    PF10_0154    PF14_0053   


    SSF47323 - tRNAsyn_1a_bind (Superfamily link)

    Interpro entry IPR009080 : Aminoacyl-tRNA synthetase, class 1a, anticodon-binding (Interpro link)

    Interpro description:

    The twenty aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA (tRNA) molecule in a highly specific two-step reaction. All of these proteins fall into one of two classes comprising ten enzymes each: class 1 (Arg, Cys, Glu, Gln, Ile, Leu, Met, Tyr, Trp and Val) and class 2 (Ala, Asn, Asp, Gly, His, Lys, Phe, Pro, Ser and Thr). Class 1 enzymes are mostly monomeric, and contain a characteristic Rossmann fold that binds the tRNA acceptor stem from the minor groove side, using two highly conserved sequences. In contrast, class 2 enzymes share an anti-parallel beta-sheet formation that binds to the major groove side of the acceptor stem. Based on their mode of binding to the tRNA acceptor stem, both classes of tRNA synthetases have each been subdivided into three subclasses, designated 1a, 1b, 1c and 2a, 2b, 2c. Class 1a enzymes (Arg, Cys, Ile, Leu, Met, Val) possess an RNA-binding domain with an alpha-helix-bundle fold; the binding of the anticodon of tRNA to the RNA-binding domain induces a conformational change in the catalytic domain of the enzyme.

    Proteins where this domain is known:
    PF08_0011    PF10_0053    PF10_0149    PF10_0340    PF13_0179    PF14_0589    PFC0470w    PFF1095w    PFL0900c    PFL1210w   


    SSF47336 - ACP_like (Superfamily link)

    Interpro entry IPR009081 : Acyl carrier protein-like (Interpro link)

    Interpro description:

    Acyl carrier protein (ACP) is an essential cofactor in the synthesis of fatty acids by the fatty acid synthetase systems in bacteria and plants. In addition to fatty acid synthesis, ACP is also involved in many other reactions that require acyl transfer steps, such as the synthesis of polyketide antibiotics, biotin precursor, membrane-derived oligosaccharides, and activation of toxins, and functions as an essential cofactor in lipoylation of pyruvate and alpha-ketoglutarate dehydrogenase complexes. Phosphopantetheine (or pantetheine 4' phosphate) is the prosthetic group of acyl carrier proteins (ACP) in some multienzyme complexes where it serves as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. Phosphopantetheine is attached to a serine residue in these proteins. The core structure of ACP consists of a four-helical bundle, where helix three is shorter than the others.

    Several other proteins share structural homology with ACP, such as the bacterial apo-D-alanyl carrier protein, which facilitates the incorporation of D-alanine into lipoteichoic acid by a ligase, necessary for the growth and development of Gram-positive organisms; and the thioester domain of the bacterial peptide carrier protein (PCP) found within large modular non-ribosomal peptide synthetases, which are responsible for the synthesis of a variety of microbial bioactive peptides.

    Proteins where this domain is known:
    PFB0385w    PFL0415w   


    SSF47364 - SRP54 (Superfamily link)

    Interpro entry IPR013822 : Signal recognition particle, SRP54 subunit, helical bundle (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity, and poorly conserved between species.

    These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.

    Proteins where this domain is known:
    PF10_0374    PF13_0350    PF14_0477   


    SSF47370 - Bromodomain (Superfamily link)

    Interpro entry IPR001487 : (Interpro link)

    Interpro description:
    Bromodomains are found in a variety of mammalian, invertebrate and yeast DNA-binding proteins. Bromodomains can interact with acetylated lysine. In some proteins, the classical bromodomain has diverged to such an extent that parts of the region are either missing or contain an insertion (e.g., mammalian protein HRX, Caenorhabditis elegans hypothetical protein ZK783.4, yeast protein YTA7). The bromodomain may occur as a single copy, or in duplicate.

    The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation.

    Proteins where this domain is known:
    PF08_0034    PF10_0328    PF14_0724    PFA0510w    PFF1440w    PFL0635c    PFL1645w   


    SSF47391 - cAMP-dep_prot_kin_reg_I/II_a/b (Superfamily link)

    Interpro entry IPR003117 : cAMP-dependent protein kinase, regulatory subunit, type I/II alpha/beta (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    In the absence of cAMP, Protein Kinase A (PKA) exists as an equimolar tetramer of regulatory (R) and catalytic (C) subunits. In addition to its role as an inhibitor of the C subunit, the R subunit anchors the holoenzyme to specific intracellular locations and prevents the C subunit from entering the nucleus. All R subunits have a conserved domain structure consisting of the N-terminal dimerization domain, inhibitory region, cAMP-binding domain A and cAMP-binding domain B. R subunits interact with C subunits primarily through the inhibitory site. The cAMP-binding domains show extensive sequence similarity and bind cAMP cooperatively.

    Two types of regulatory (R) subunits exist - types I and II - which differ in molecular weight, sequence, autophosphorylation capability, cellular location and tissue distribution. Types I and II were further sub-divided into alpha and beta subtypes, based mainly on sequence similarity. This entry represents types I-alpha, I-beta, II-alpha and II-beta regulatory subunits of PKA proteins. These subunits contain the dimerisation interface and binding site for A-kinase-anchoring proteins (AKAPs).

    Proteins where this domain is known:
    PF11_0050   


    SSF47413 - Lambda_like_DNA (Superfamily link)

    Interpro entry IPR010982 : Lambda repressor-like, DNA-binding (Interpro link)

    Interpro description:

    Bacteriophage lambda C1 repressor controls the expression of viral genes as part of the lysogeny/lytic growth switch. C1 is essential for maintaining lysogeny, where the phage replicates non-disruptively along with the host. If the host cell is threatened, then lytic growth is induced. The Lambda C1 repressor consists of two domains connected by a linker: an N-terminal DNA-binding domain that also mediates interactions with RNA polymerase, and a C-terminal dimerisation domain. The DNA-binding domain consists of four helices in a closed folded leaf motif. Several different phage repressors from different helix-turn-helix families contain DNA-binding domains that adopt a similar topology. These include the Lambda Cro repressor, Bacteriophage 434 C1 and Cro repressors, P22 C2 repressor, and Bacteriophage Mu Ner protein.

    The DNA-binding domain of Bacillus subtilis spore inhibition repressor SinR is identical to that of phage repressors. SinR represses sporulation, which only occurs in response to adverse conditions. This provides a possible evolutionary link between the two adaptive responses of bacterial sporulation and prophage induction.

    Other DNA-binding domains also display similar structural folds to that of Lambda C1. These include bacterial regulators such as the purine repressor (PurR), the lactose repressor (LacR) and the fructose repressor (FruR), each of which has an N-terminal DNA-binding domain that exhibits a fold similar to that of lambda C1, except that they lack the first helix. POU-specific domains found in transcription factors such as Oct-1, Pit-1 and Hepatocyte nuclear factor 1a (LFB1/HNF1) display four-helical fold DNA-binding domains similar to that of Lambda C1. The N-terminal domain of cyanase has an alpha-helix bundle motif similar to Lambda C1, but it probably does not bind DNA. Cyanase is an enzyme found in bacteria and plants that catalyses the reaction of cyanate with bicarbonate to produce ammonia and carbon dioxide in response to extracellular cyanate.

    Proteins where this domain is known:
    PF11_0293   


    SSF47446 - Signal_recog_particle_SRP54_M (Superfamily link)

    Interpro entry IPR004125 : Signal recognition particle, SRP54 subunit, M-domain (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents the M domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity, and poorly conserved between species.

    These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homolog of ftsY; and bacterial flagellar biosynthesis protein flhF.

    Proteins where this domain is known:
    PF14_0477   


    SSF47473 - EF-hand (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.156    MAL13P1.267    MAL13P1.4    MAL13P1.515    MAL7P1.10    MAL7P1.154a    MAL7P1.217    MAL7P1.57    MAL7P1.69    MAL8P1.219    MAL8P1.79    PF07_0003    PF07_0072    PF07_0132    PF07_0134    PF08_0105    PF10_0004    PF10_0005    PF10_0145    PF10_0170    PF10_0177    PF10_0244    PF10_0271    PF10_0301    PF10_0393    PF10_0396    PF10_0400    PF10_0405    PF11_0009    PF11_0010    PF11_0022    PF11_0066    PF11_0098    PF11_0239    PF11_0242    PF11_0389    PF11_0519    PF13_0004    PF13_0211    PF13_0302    PF14_0002    PF14_0004    PF14_0006    PF14_0181    PF14_0224    PF14_0323    PF14_0420    PF14_0443    PF14_0492    PF14_0607    PF14_0769    PF14_0772    PFA0010c    PFA0020w    PFA0080c    PFA0305c    PFA0345w    PFA0515w    PFA0740w    PFB0030c    PFB0040c    PFB0815w    PFB1000w    PFB1010w    PFB1050w    PFC0010c    PFC0040w    PFC0190c    PFC0420w    PFC1095w    PFD0135c    PFD0692c    PFD1240w    PFE0020c    PFF0015c    PFF0035c    PFF0265c    PFF0520w    PFF1320c    PFF1540w    PFF1555w    PFF1560c    PFF1565c    PFF1575w    PFF1590w    PFI0010c    PFI0020w    PFI0055c    PFI0065w    PFI0070w    PFI1815c    PFL0025c    PFL2225w    PFL2615w    PFL2645c    PFL2660w   


    SSF47576 - Calponin-homology (Superfamily link)

    Interpro entry IPR016146 : (Interpro link)

    Interpro description:

    This entry represents the calponin-homology (CH) domain, a superfamily of actin-binding domains found in cytoskeletal proteins (which contain two CH domains in tandem repeat), in regulatory proteins from muscle, and in signal transduction proteins. This domain has a core structure consisting of a 4-helical bundle.

    Proteins where this domain is known:
    MAL8P1.136    PF14_0454    PFC0305w    PFI1450c   


    SSF47592 - MDM2 (Superfamily link)

    Interpro entry IPR003121 : (Interpro link)

    Interpro description:

    The SWI/SNF family of complexes, which are conserved from yeast to humans, are ATP-dependent chromatin-remodelling proteins that facilitate transcription activation. The mammalian complexes are made up of 9-12 proteins called BAFs (BRG1-associated factors). The BAF60 family has at least three members: BAF60a, which is ubiquitous, BAF60b and BAF60c, which are expressed in muscle and pancreatic tissues, respectively. BAF60b is present in alternative forms of the SWI/SNF complex, including complex B (SWIB), which lacks BAF60a. The SWIB domain is a conserved region found within the BAF60b proteins, and can be found fused to the C-terminus of DNA topoisomerase in Chlamydia.

    MDM2 is an oncoprotein that acts as a cellular inhibitor of the p53 tumour suppressor by binding to the transactivation domain of p53 and suppressing its ability to activate transcription. p53 acts in response to DNA damage, inducing cell cycle arrest and apoptosis. Inactivation of p53 is a common occurrence in neoplastic transformations. The core of MDM2 folds into an open bundle of four helices, which is capped by two small 3-stranded beta-sheets. It consists of a duplication of two structural repeats. MDM2 has a deep hydrophobic cleft in which the p53 alpha-helix binds; p53 residues involved in transactivation are buried deep within the cleft of MDM2, thereby concealing the p53 transactivation domain.

    The SWIB and MDM2 domains are homologous and share a common fold.

    Proteins where this domain is known:
    PFF0560c   


    SSF47616 - GST_C_like (Superfamily link)

    Interpro entry IPR010987 : (Interpro link)

    Interpro description:

    In eukaryotes, glutathione S-transferases (GSTs) participate in the detoxification of reactive electrophilic compounds by catalysing their conjugation to glutathione. GST is found as a domain in S-crystallins from squid, and in proteins with no known GST activity, such as eukaryotic elongation factors 1-gamma and the HSP26 family of stress-related proteins, which include auxin-regulated proteins in plants and stringent starvation proteins in Escherichia coli. The major lens polypeptide of cephalopods is also a GST. Bacterial GSTs of known function often have a specific, growth-supporting role in biodegradative metabolism: epoxide ring opening and tetrachlorohydroquinone reductive dehalogenation are two examples of the reactions catalysed by these bacterial GSTs. Some regulatory proteins, like the stringent starvation proteins, also belong to the GST family. GST seems to be absent from Archaea, in which gamma-glutamylcysteine substitutes for glutathione as the major thiol.

    Glutathione S-transferases form homodimers, but in eukaryotes can also form heterodimers of the A1 and A2 or YC1 and YC2 subunits. The homodimeric enzymes display a conserved structural fold. Each monomer is composed of a distinct N-terminal sub-domain, which adopts the thioredoxin fold, and a C-terminal all-helical sub-domain, which adopts a 4-helical bundle fold. This entry is the C-terminal domain.

    Glutaredoxin 2 (Grx2), a glutathione-dependent disulphide oxidoreductase, is structurally similar to GSTs, even though the two lack any sequence similarity. Grx2 is also composed of N- and C-terminal subdomains. It is thought that the primary function of Grx2 is to catalyse reversible glutathionylation of proteins with glutathione in cellular redox regulation, including the response to oxidative stress. Grx2 is dissimilar to other glutaredoxins apart from containing the conserved active site residues.

    Proteins where this domain is known:
    PF13_0214    PF14_0187   


    SSF47661 - t-snare (Superfamily link)

    Interpro entry IPR010989 : t-SNARE (Interpro link)

    Interpro description:

    Soluble N-ethylmaleimide-sensitive factor attachment protein receptor (SNARE) proteins are a family of membrane-associated proteins characterised by an alpha-helical coiled-coil domain called the SNARE motif. These proteins are classified as v-SNAREs and t-SNAREs based on their localisation on vesicle or target membrane; another classification scheme defines R-SNAREs and Q-SNAREs, based on the conserved arginine or glutamine residue in the centre of the SNARE motif. SNAREs are localised to distinct membrane compartments of the secretory and endocytic trafficking pathways, and contribute to the specificity of intracellular membrane fusion processes.

    The t-SNARE domain consists of a 4-helical bundle with a coiled-coil twist. The SNARE motif contributes to the fusion of two membranes. SNARE motifs fall into four classes: homologues of syntaxin 1a (t-SNARE), VAMP-2 (v-SNARE), and the N- and C-terminal SNARE motifs of SNAP-25. It is thought that one member from each class interacts to form a SNARE complex.

    The SNARE motif represented in this entry is found in the N-terminal domains of certain syntaxin family members: syntaxin 1a, which is required for neurotransmitter release; syntaxin 6, which is found in endosomal transport vesicles; yeast Sso1p; and Vam3p, a yeast syntaxin essential for vacuolar fusion. The SNARE motifs in these proteins share structural similarity, despite having a low level of sequence similarity.

    Proteins where this domain is known:
    MAL13P1.169    PF11_0052    PF14_0300    PF14_0500    PFB0480w    PFL0505c    PFL2070w   


    SSF47676 - TFIIS_conserved (Superfamily link)

    Interpro entry IPR010990 : Transcription elongation factor, TFIIS/elongin A/CRSP70, N-terminal (Interpro link)

    Interpro description:

    Transcription factor S-II (TFIIS) is a eukaryotic protein which induces mRNA cleavage by enhancing the intrinsic nuclease activity of RNA polymerase (Pol) II, past template-encoded pause sites. TFIIS shows DNA-binding activity only in the presence of RNA polymerase II. It is widely distributed, being found in mammals, Drosophila, yeast and in the archaebacterium Sulfolobus acidocaldarius. S-II proteins have a relatively conserved C-terminal region but a variable N-terminal region, and some members of this family are expressed in a tissue-specific manner.

    TFIIS is a modular factor that comprises an N-terminal domain I, a central domain II, and a C-terminal domain III. The weakly conserved domain I forms a four-helix bundle and is not required for TFIIS activity. Domain II forms a three-helix bundle, and domain III adopts a zinc-ribbon fold with a thin protruding beta-hairpin. Domain II and the linker between domains II and III are required for Pol II binding, whereas domain III is essential for stimulation of RNA cleavage. TFIIS extends from the polymerase surface via a pore to the internal active site, spanning a distance of 100 Angstroms. Two essential and invariant acidic residues in a TFIIS loop complement the Pol II active site and could position a metal ion and a water molecule for hydrolytic RNA cleavage. TFIIS also induces extensive structural changes in Pol II that would realign nucleic acids in the active centre.

    This entry represents the conserved N-terminal domain found in the transcription elongation factors TFIIS, elongin A and CRSP70. The N-terminal domain in these transcription factors is conserved from yeast to man, and has a 4-helical bundle fold with a left-handed twist within a left-handed superhelix. Elongin A is a mammalian transcription elongation factor that forms the active subunit of the Elongin complex, which stimulates the rate of elongation by RNA polymerase II by suppressing the transient pausing of the polymerase at many sites along the DNA template. CRSP70 is an essential subunit of the CRSP complex, which is required for the activity of the enhancer-binding protein Sp1.

    Proteins where this domain is known:
    PF07_0057   


    SSF47694 - Cyt_c_oxidase_6B (Superfamily link)

    Interpro entry IPR003213 : Cytochrome c oxidase, subunit VIb (Interpro link)

    Interpro description:

    Cytochrome c oxidase is an oligomeric enzymatic complex that is a component of the respiratory chain complex and is involved in the transfer of electrons from cytochrome c to oxygen. In eukaryotes this enzyme complex is located in the mitochondrial inner membrane; in aerobic prokaryotes it is found in the plasma membrane.

    In eukaryotes, in addition to the three large subunits, I, II and III, that form the catalytic centre of the enzyme complex, there are a variable number of small polypeptide subunits. One of these subunits is the potentially haem-binding subunit, VIb, which is encoded in the nucleus.

    Proteins where this domain is known:
    PFI1375w   


    SSF47729 - IHF_like_DNA_bnd (Superfamily link)

    Interpro entry IPR010992 : IHF-like DNA-binding (Interpro link)

    Interpro description:

    Integration host factor (IHF) is a small heterodimeric protein that binds the minor groove of DNA in a sequence-specific manner and induces a large bend. This bending stabilises distinct DNA conformations that are required during several bacterial processes, such as recombination, transposition, replication and transcription. The core structure of IHF consists of a partly opened 4-helical bundle that is capped with a beta-sheet.

    Prokaryotic protein HU and the bacteriophage SPO1 transcription factor TF1 are closely related to IHF. These proteins are collectively referred to as type II DNA-binding proteins (DBPII), forming a group of basic, dimeric proteins found in all bacteria that are able to bind DNA to induce and stabilise DNA bending. HU plays a structural role in replication initiation, transcription regulation, site-specific recombination, and the compaction of the bacterial genome. TF1 is essential for viral multiplication.

    The DNA-binding domain of the TraM protein, an essential component of the DNA transfer machinery of the conjugative resistance plasmid R1, appears to have a similar structure to DBPII.

    Proteins where this domain is known:
    PF07_0042    PFI0230c   


    SSF47769 - SAM_homology (Superfamily link)

    Interpro entry IPR010993 : (Interpro link)

    Interpro description:

    Sterile alpha motif (SAM) domains are known to be involved in diverse protein-protein interactions, associating with both SAM-containing and non-SAM-containing proteins. SAM domains exhibit a conserved structure, consisting of a 4-5-helical bundle of two orthogonally packed alpha-hairpins. However, SAM domains display a diversity of function, being involved in interactions with proteins, DNA and RNA. The name sterile alpha motif arose from its presence in proteins that are essential for yeast sexual differentiation. The SAM domain has had various names, including SPM, PTN (pointed), SEP (yeast sterility, Ets-related, PcG proteins), NCR (N-terminal conserved region) and HLH (helix-loop-helix) domain, all of which are related and can be classified as SAM domains.

    SAM domains occur in eukaryotic and in some bacterial proteins. Structures have been determined for several proteins that contain SAM domains, including Ets-1 transcription factor, which plays a role in the development and invasion of tumour cells by regulating the expression of matrix-degrading proteases; Etv6 transcription factor, gene rearrangements of which have been demonstrated in several malignancies; EphA4 receptor tyrosine kinase, which is believed to be important for the correct localization of a motoneuron pool to a specific position in the spinal cord; EphB2 receptor, which is involved in spine morphogenesis via intersectin, Cdc42 and N-Wasp; p73, a p53 homologue involved in neuronal development; and polyhomeotic, which is a member of the Polycomb group of genes (Pc-G) required for the maintenance of the spatial expression pattern of homeotic genes.

    Proteins where this domain is known:
    PF11_0079    PF13_0258    PFB0520w    PFI1275w   


    SSF47781 - RuvA_2_like (Superfamily link)

    Interpro entry IPR010994 : (Interpro link)

    Interpro description:

    In prokaryotes, RuvA, RuvB, and RuvC process the universal DNA intermediate of homologous recombination, termed the Holliday junction. The tetrameric DNA helicase RuvA specifically binds to the Holliday junction and facilitates the isomerization of the junction from the stacked folded configuration to the square-planar structure. In the RuvA tetramer, each subunit consists of three domains, I, II and III, where I and II form the major core that is responsible for Holliday junction binding and for the base pair rearrangements of the Holliday junction executed at the crossover point, whereas domain III regulates branch migration through direct contact with RuvB. Domain 2 has a SAM (sterile alpha motif)-like alpha bundle fold that occurs as a duplication containing two helix-hairpin-helix (HhH) motifs.

    The C-terminal domain (CTD) of the excision repair protein UvrC shows structural similarity to RuvA domain 2. The CTD of UvrC is essential for 5' incision in the prokaryotic nucleotide excision repair process, and acts to mediate structure-specific binding to single-stranded-double-stranded junction DNA.

    Domain 3 of NAD+-dependent DNA ligase consists of a duplication of two RuvA-like domains (four HhH motifs), and also contains a zinc-finger subdomain. DNA ligases catalyze the crucial step of joining the breaks in duplex DNA during DNA replication, repair and recombination, utilizing either ATP or NAD+ as a cofactor.

    Proteins where this domain is known:
    PFB0160w   


    SSF47794 - Rad51_N (Superfamily link)

    Interpro entry IPR010995 : DNA repair Rad51/transcription factor NusA, alpha-helical (Interpro link)

    Interpro description:

    This entry represents an alpha-helical bundle domain, which has a SAM domain-like fold. This compact domain consists of a 4-5 helical bundle of two orthogonally packed alpha-hairpins, and contains one classic and one pseudo HhH (helix-hairpin-helix) motif. This domain is found at the N-terminal of the DNA repair protein Rad51, at the C-terminal of the transcription elongation protein NusA, and at the C-terminal of the hypothetical protein AF1548.

    Human Rad51 protein is a homologue of Escherichia coli RecA protein, and functions in DNA repair and recombination. In higher eukaryotes, Rad51 protein is essential for cell viability. The N-terminal region of Rad51 is highly conserved among eukaryotic Rad51 proteins but is absent from RecA, suggesting a Rad51-specific function for this region. The N-terminal domain is involved in interactions with DNA and proteins; DNA binding may be regulated via phosphorylation within the N-terminal domain.

    NusA (N utilisation substance A) from E. coli is an essential transcription factor that associates with the RNA polymerase (RNAP) core enzyme, where it modulates transcriptional pausing, termination and anti-termination. The C-terminal of NusA consists of two repeat units, and is responsible for the interaction of NusA with the C-terminal of RNAP, and for its interaction with protein N from phage lambda during anti-termination.

    Proteins where this domain is known:
    MAL8P1.76    PF11_0087   


    SSF47802 - DNApol_B_N_like (Superfamily link)

    Interpro entry IPR010996 : DNA-directed DNA polymerase, family X, beta-like, N-terminal (Interpro link)

    Interpro description:

    Mammalian DNA polymerase beta (polB) is a 39-kDa protein with both nucleotidyltransferase and 5'-deoxyribose phosphodiesterase activities, playing a role in both excision repair and meiosis. polB has a modular organisation with an 8-kDa N-terminal domain (NTD) connected to the 31-kDa C-terminal domain by a protease-hypersensitive hinge region. The NTD acts as a single-stranded DNA binding domain, interacting most efficiently with the 5'-phosphate of the downstream primer of the gapped DNA. This interaction is mediated by a helix-hairpin-helix motif (HhH), which is also found in several other DNA repair enzymes. The residue threonine 79 (T79), which is located within the NTD, was identified as being critical to polB function, even though it makes no contact with either DNA template or dNTP substrate; T79 is located between two HhH motifs, and acts as a hinge residue that is important for positioning the DNA within the active site.

    The catalytic core (residues 148-242) of murine terminal deoxynucleotidyl transferase (TdT) displays a structural fold that is similar to polB, and shares a common two-metal ion mechanism of nucleotidyl transfer with polB. TdT elongates DNA strands in a template-independent manner, and belongs to the pol X family of polymerases. TdT has only been found in vertebrates, where it is highly conserved. TdT brings additional diversity in the immune repertoire by adding nucleotides, called N regions, to the V(D)J recombination junction sites of immunoglobulin and T-cell receptor genes.

    Proteins where this domain is known:
    PF14_0470   


    SSF47807 - 5_3_exo_C (Superfamily link)

    Interpro entry IPR008918 : Helix-hairpin-helix motif, class 2 (Interpro link)

    Interpro description:

    The helix-hairpin-helix (HhH) motif is a domain of around 20 amino acids present in prokaryotic and eukaryotic non-sequence-specific DNA binding proteins. The HhH motif is similar to, but distinct from, the helix-turn-helix (HtH) and the helix-loop-helix (HLH) motifs. All three motifs have two helices (H1 and H2) connected by a short turn. DNA-binding proteins with a HhH structural motif are involved in non-sequence-specific DNA binding that occurs via the formation of hydrogen bonds between protein backbone nitrogens and DNA phosphate groups. These HhH motifs are observed in DNA repair enzymes and in DNA polymerases. By contrast, proteins with a HtH motif bind DNA in a sequence-specific manner through the binding of H2 with the major groove; these proteins are primarily gene regulatory proteins. DNA-binding proteins with the HLH structural motif are transcriptional regulatory proteins and are principally related to a wide array of developmental processes.

    Examples of proteins that contain a HhH motif include the 5'-exonuclease domains of prokaryotic DNA polymerases, the eukaryotic/prokaryotic RAD2 family of 5'-3' exonucleases such as T4 RNase H and T5, eukaryotic 5' endonucleases such as FEN-1 (Flap), and some viral exonucleases.

    Proteins where this domain is known:
    PF07_0105    PF10_0080    PFB0180w    PFB0265c    PFD0420c   


    SSF47819 - HRDC_like (Superfamily link)

    Interpro entry IPR010997 : HRDC-like (Interpro link)

    Interpro description:

    The HRDC (helicase and RNaseD C-terminal) domain is comprised of two orthogonally packed alpha-hairpin subdomains, and is involved in interactions with DNA and protein.

    The HRDC (helicase and RNaseD C-terminal) domain is found at the C terminus of many RecQ helicases, including the human Werner and Bloom syndrome proteins. RecQ helicases have been shown to unwind DNA in an ATP-dependent manner. The structure of the HRDC domain consists of a 4-5 helical bundle of two orthogonally packed alpha-hairpins, and as such it resembles auxiliary domains in bacterial DNA helicases and other proteins that interact with nucleic acids. A positively charged region on the surface of the HRDC domain is able to interact with DNA.

    The HRDC domain is also present in eukaryotic and archaeal RNA polymerase II subunit RPB4, the N-terminal of which forms a heterodimerisation alpha-hairpin.

    Proteins where this domain is known:
    PFB0245c   


    SSF47917 - ATPase_a/b_C (Superfamily link)

    Interpro entry IPR000793 : ATPase, F1/V1/A1 complex, alpha/beta subunit, C-terminal (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    This entry represents the alpha and beta subunits found in the F1, V1, and A1 complexes of F-, V- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases). The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.

    In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.

    In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.

    The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain. This entry represents the C-terminal domain of the alpha/A/beta/B subunits, which forms a left-handed superhelix composed of 4-5 individual helices. The C-terminal domain can vary between the alpha and beta subunits, and between different ATPases.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0065    PFB0795w    PFL1725w   


    SSF47923 - RabGAP_TBC (Superfamily link)

    Interpro entry IPR000195 : RabGAP/TBC (Interpro link)

    Interpro description:

    Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases.

    Proteins where this domain is known:
    MAL13P1.244    MAL7P1.127    PF11_0151    PF13_0117    PF14_0048    PF14_0647    PF14_0699    PFC1030w    PFE0330w    PFI0195c    PFI0345w    PFL1445w   


    SSF47928 - ATPsynt_OSCP (Superfamily link)

    Interpro entry IPR000711 : ATPase, F1 complex, OSCP/delta subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family represents subunits called delta in bacterial and chloroplast ATPase, or OSCP (oligomycin sensitivity conferral protein) in mitochondrial ATPase (note that in mitochondria there is a different delta subunit). The OSCP/delta subunit appears to be part of the peripheral stalk that holds the F1 complex alpha3beta3 catalytic core stationary against the torque of the rotating central stalk, and links subunit A of the F0 complex with the F1 complex. In mitochondria, the peripheral stalk consists of OSCP, as well as F0 components F6, B and D. In bacteria and chloroplasts the peripheral stalks have different subunit compositions: delta and two copies of F0 component B (bacteria), or delta and F0 components B and B' (chloroplasts).

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    MAL13P1.47   


    SSF47938 - Prp18 (Superfamily link)

    Interpro entry IPR004098 : Prp18 (Interpro link)

    Interpro description:

    The splicing factor Prp18 is required for the second step of pre-mRNA splicing. Prp18 appears to be primarily associated with the U5 snRNP.

    The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles.

    Proteins where this domain is known:
    PFI1115c   


    SSF47954 - Cyclin_like (Superfamily link)

    Interpro entry IPR011028 : (Interpro link)

    Interpro description:

    Cyclins are eukaryotic proteins that play an active role in controlling nuclear cell division cycles, and regulate cyclin-dependent kinases (CDKs). Cyclins, together with the p34 (cdc2) or cdk2 kinases, form the Maturation Promoting Factor (MPF). There are two main groups of cyclins: G1/S cyclins, which are essential for the control of the cell cycle at the G1/S (start) transition, and G2/M cyclins, which are essential for the control of the cell cycle at the G2/M (mitosis) transition. G2/M cyclins accumulate steadily during G2 and are abruptly destroyed as cells exit from mitosis (at the end of the M-phase). In most species, there are multiple forms of G1 and G2 cyclins. For example, in vertebrates, there are two G2 cyclins, A and B, and at least three G1 cyclins, C, D, and E.

    Cyclin homologues have been found in various viruses, including Saimiriine herpesvirus 2 (Herpesvirus saimiri) and Human herpesvirus 8 (HHV-8) (Kaposi's sarcoma-associated herpesvirus). These viral homologues differ from their cellular counterparts in that the viral proteins have gained new functions and eliminated others to harness the cell and benefit the virus.

    This domain is also found as the core domain in transcription factor IIB (TFIIB) and in the retinoblastoma tumour suppressor.

    Proteins where this domain is known:
    MAL13P1.323    MAL8P1.152    PF13_0022    PF14_0469    PF14_0605    PFA0525w    PFE0920c    PFF0270c   


    SSF47973 - Ribosomal_S7 (Superfamily link)

    Interpro entry IPR000235 : Ribosomal protein S7 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S7 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S7 is known to bind directly to part of the 3'end of 16S ribosomal RNA. It belongs to a family of ribosomal proteins which have been grouped on the basis of sequence similarities. The structure for S7 is known.

    Proteins where this domain is known:
    PF07_0088   


    SSF48019 - Pol_clamp_load_C (Superfamily link)

    Interpro entry IPR008921 : DNA polymerase III clamp loader subunit, C-terminal (Interpro link)

    Interpro description:

    The Escherichia coli DNA polymerase III gamma complex clamp loader assembles the ring-shaped beta sliding clamp onto DNA. The core polymerase is tethered to the template by beta, enabling processive replication of the genome. Although the E. coli clamp loader complex contains five different subunits, clamp loading requires only three of these: gamma, delta and delta'. Three gamma subunits, and one each of delta and delta', are arranged in a circle. Each subunit adopts the same chain topology, and folds into three domains. However, the relative orientation of these domains is different for each subunit. The carboxy-terminal domains provide the major subunit contacts of the pentamer, although other intersubunit contacts are present. The amino-terminal domains do not form a continuous circle. These domains are arranged in a highly asymmetric fashion, and appear to dangle under the carboxy-terminal pentamer 'umbrella'.

    Proteins where this domain is known:
    PF14_0601    PFB0840w    PFB0895c    PFL2005w   


    SSF48108 - CarbamoylP_synth_lsu_oligo (Superfamily link)

    Interpro entry IPR005480 : Carbamoyl phosphate synthetase, large subunit, oligomerisation (Interpro link)

    Interpro description:

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.

    This entry represents the oligomerisation domain found in the large subunit of carbamoyl phosphate synthases as well as in certain other carboxy phosphate domain-containing enzymes.

    Proteins where this domain is known:
    PF13_0044   


    SSF48140 - Ribosomal_L19/L19e (Superfamily link)

    Interpro entry IPR000196 : Ribosomal protein L19/L19e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents the ribosomal protein L19 from eukaryotes, as well as L19e from archaea. L19/L19e is absent in bacteria. L19/L19e is part of the large ribosomal subunit, whose structure has been determined in a number of eukaryotic and archaeal species. L19/L19e is a multi-helical protein consisting of two different 3-helical domains connected by a long, partly helical linker.

    Proteins where this domain is known:
    PFF0700c   


    SSF48150 - DNA_glycsylse (Superfamily link)

    Interpro entry IPR011257 : DNA glycosylase (Interpro link)

    Interpro description:

    DNA glycosylases act to repair oxidative damage in DNA. These proteins are redundant, as there are several different types of DNA glycosylases that are able to compensate for one another. Examples include the endonuclease III subfamily, the mismatch glycosylases subfamily, the 3-methyladenine DNA glycosylases I subfamily, and the DNA repair glycosylases subfamily.

    Proteins where this domain is known:
    PF11_0306    PFF0715c    PFI0835c   


    SSF48163 - tRNA-synt_bind (Superfamily link)

    Interpro entry IPR008925 : Aminoacyl-tRNA synthetase, class I, anticodon-binding (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    Structurally, an alpha-helix-bundle anticodon-binding domain characterises the class Ia synthetases, whereas the class Ib synthetases, GlnRS and GluRS have distinct anticodon-binding domains. The Rossmann-fold and anticodon-binding domains are connected by a beta-alpha-alpha-beta-alpha topology ('SC fold') domain that contains the class I specific KMSKS motif.

    Proteins where this domain is known:
    MAL13P1.281   


    SSF48168 - Ribonucleo_red_N (Superfamily link)

    Interpro entry IPR008926 : Ribonucleotide reductase R1 subunit, N-terminal (Interpro link)

    Interpro description:

    Ribonucleotide reductase (RNR), of which R1 is the large subunit, is an essential enzyme required for DNA replication and DNA repair. In both Escherichia coli and higher organisms, the enzyme consists of two non-identical subunits, a dimer of an 85-kDa protein, R1, and a dimer of a 45-kDa protein, R2. Both subunits are essential for RNR enzyme activity - R1 contains the reducing active cysteine pair in its substrate-binding site, and R2 provides a catalytically essential organic radical. R1 is able to bind and reduce the four common ribonucleoside diphosphates. Substrate specificity is determined by nucleoside triphosphates binding to a protein site different from the active site and acting as allosteric effectors. Thus the presence of ATP makes the enzyme reduce CDP and UDP, dGTP favours ADP reduction and dTTP favours GDP reduction. dATP is a general inhibitor. This provides a mechanism for a balanced enzymatic production of building blocks for DNA synthesis.
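
    The effector-to-substrate relationships described above can be summarised as a small lookup table. The following is only an illustrative sketch: the names FAVOURED_SUBSTRATES, GENERAL_INHIBITORS and favoured_substrates are hypothetical, and the table encodes nothing beyond the four statements in the description.

```python
# Allosteric specificity of RNR as stated in the description above.
# All names here are hypothetical and chosen for illustration only.
FAVOURED_SUBSTRATES = {
    "ATP": ("CDP", "UDP"),   # ATP makes the enzyme reduce CDP and UDP
    "dGTP": ("ADP",),        # dGTP favours ADP reduction
    "dTTP": ("GDP",),        # dTTP favours GDP reduction
}
GENERAL_INHIBITORS = {"dATP"}  # dATP is described as a general inhibitor


def favoured_substrates(effector):
    """Return the substrates whose reduction the given effector favours.

    An empty tuple is returned for a general inhibitor; effectors not
    mentioned in the description raise KeyError rather than being guessed.
    """
    if effector in GENERAL_INHIBITORS:
        return ()
    if effector not in FAVOURED_SUBSTRATES:
        raise KeyError(f"no specificity rule recorded for {effector!r}")
    return FAVOURED_SUBSTRATES[effector]


if __name__ == "__main__":
    for name in ("ATP", "dGTP", "dTTP", "dATP"):
        print(name, "->", favoured_substrates(name))
```

    Read in sequence, the table shows how each effector shifts reduction towards a different substrate, which is the balancing mechanism the description refers to.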

    Proteins where this domain is known:
    PF14_0352   


    SSF48173 - Photolyase_FAD-bd/Cryptochr_C (Superfamily link)

    Interpro entry IPR005101 : DNA photolyase, FAD-binding/Cryptochrome, C-terminal (Interpro link)

    Interpro description:

    This entry represents a multi-helical domain composed of two all-alpha subdomains that is found as the C-terminal domain in cryptochrome proteins, as well as at the C-terminal of DNA photolyase where it acts as a FAD-binding domain (the N-terminal of DNA photolyase binds a light-harvesting cofactor).

    Photolyases and cryptochromes are related flavoproteins that bind FAD. Photolyases harness the energy of blue light to repair DNA damage by removing pyrimidine dimers. Cryptochromes (CRY1 and CRY2) are blue light photoreceptors that mediate blue light-induced gene expression.

    DNA photolyases are DNA repair enzymes that repair mismatched pyrimidine dimers induced by exposure to ultra-violet light. They bind to UV-damaged DNA containing pyrimidine dimers and, upon absorbing a near-UV photon (300 to 500 nm), they catalyse dimer splitting, breaking the cyclobutane ring joining the two pyrimidines of the dimer so as to split them into the constituent monomers; this process is called photoreactivation. DNA photolyases require two chromophore cofactors for their activity. All monomers contain a reduced FAD moiety, and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm.

    Proteins where this domain is known:
    PFE0675c   


    SSF48179 - 6DGDH_C_like (Superfamily link)

    Interpro entry IPR008927 : 6-phosphogluconate dehydrogenase, C-terminal-like (Interpro link)

    Interpro description:

    6-phosphogluconate dehydrogenase catalyses the oxidative decarboxylation of 6-phosphogluconate to ribulose 5-phosphate with the concomitant reduction of NADP to NADPH. The metazoan 6PGDHs have a well-conserved glycine-serine rich sequence at the C-terminus, which is lacking from bacterial enzymes and from those of the parasitic protozoan Trypanosoma brucei. The active dimer of the mammalian enzyme assembles with the C-terminal tail of one subunit threaded through the other, forming part of the substrate-binding site. The tail of T. brucei 6PGDH is shorter than that of the mammalian enzyme and its terminal residues associate tightly with the second monomer. The three-dimensional structure shows this generates additional interactions between the subunits close to the active site; the coenzyme-binding domain is thereby associated more tightly with the helical domain. Three residues, conserved in all other known sequences, are important in creating a salt bridge between monomers close to the substrate-binding site.

    This domain is structurally similar to domains found in several different families, including those represented by mannitol 2-dehydrogenase, acetohydroxy acid isomeroreductase, short chain L-3-hydroxyacyl CoA dehydrogenase, UDP-glucose/GDP-mannose dehydrogenase (dimerisation domain), N-(1-D-carboxylethyl)-L-norvaline dehydrogenase, glycerol-3-phosphate dehydrogenase, and ketopantoate reductase (PanE).

    Proteins where this domain is known:
    PF11_0157    PF14_0520    PFL0780w   


    SSF48239 - Terp_cyc_toroid (Superfamily link)

    Interpro entry IPR008930 : (Interpro link)

    Interpro description:

    Protein prenyltransferases catalyze the transfer of the carbon moiety of C15 farnesyl pyrophosphate or geranylgeranyl pyrophosphate to a conserved cysteine residue in a CaaX motif of protein and peptide substrates. The addition of a farnesyl group is required to anchor proteins to the cell membrane. In the 3D structure of a mammalian Ras farnesyltransferase (FTase), both subunits are largely composed of alpha-helices. The alpha-2 to alpha-15 helices in the alpha subunit fold into a novel helical hairpin structure, resulting in a crescent-shaped domain that envelops part of the beta subunit. The 12 helices of the beta-subunit form an alpha-alpha barrel. Six additional helices connect the inner core of helices and form the outside of the helical barrel. A deep cleft surrounded by hydrophobic amino acids in the centre of the barrel is proposed as the FPP-binding pocket. A single Zn2+ ion is located at the junction between the hydrophilic surface groove near the subunit interface

    Terpenoid cyclases such as squalene cyclase, pentalenene synthase, 5-epi-aristolochene synthase, and trichodiene synthase are responsible for the synthesis of cholesterol, a hydrocarbon precursor of the pentalenolactone family of antibiotics, a precursor of the antifungal phytoalexin capsidiol, and the precursor of antibiotics and mycotoxins, respectively. The structures of these enzymes share a feature referred to as the 'terpenoid synthase fold', consisting of 10-12 mostly antiparallel alpha-helices, which is also observed in protein prenyltransferases. This high structural similarity supports the hypothesis that the three families of prenyltransferases are evolutionarily related despite their low sequence similarity.

    Proteins where this domain is known:
    PF11_0483    PFF0120w    PFL0695c   


    SSF48256 - Citrate_synthase_core (Superfamily link)

    Interpro entry IPR016141 : Citrate synthase-like, core (Interpro link)

    Interpro description:

    Citrate synthase is a member of a small family of enzymes that can directly form a carbon-carbon bond without the presence of metal ion cofactors. It catalyses the first reaction in the Krebs' cycle, namely the conversion of oxaloacetate and acetyl-coenzyme A into citrate and coenzyme A. This reaction is important for energy generation and for carbon assimilation. The reaction proceeds via a non-covalently bound citryl-coenzyme A intermediate in a 2-step process (aldol-Claisen condensation followed by the hydrolysis of citryl-CoA).

    Citrate synthase enzymes are found in two distinct structural types: type I enzymes (found in eukaryotes, Gram-positive bacteria and archaea) form homodimers and have shorter sequences than type II enzymes, which are found in Gram-negative bacteria and are hexameric in structure. In both types, the monomer is composed of two domains: a large alpha-helical domain consisting of two structural repeats, where the second repeat is interrupted by a small alpha-helical domain. The cleft between these domains forms the active site, where both citrate and acetyl-coenzyme A bind. The enzyme undergoes a conformational change upon binding of the oxaloacetate ligand, whereby the active site cleft closes over in order to form the acetyl-CoA binding site. The energy required for domain closure comes from the interaction of the enzyme with the substrate. Type II enzymes possess an extra N-terminal beta-sheet domain, and some type II enzymes are allosterically inhibited by NADH.

    This entry represents the core of type I and II citrate synthase enzymes, comprising both the large and small alpha-helical domains. In addition, this entry represents the related enzymes 2-methylcitrate synthase and ATP citrate synthase. 2-methylcitrate synthase catalyses the conversion of oxaloacetate and propanoyl-CoA into (2R,3S)-2-hydroxybutane-1,2,3-tricarboxylate and coenzyme A. This enzyme is induced during bacterial growth on propionate, while type II hexameric citrate synthase is constitutive. ATP citrate synthase (also known as ATP citrate lyase) catalyses the MgATP-dependent, CoA-dependent cleavage of citrate into oxaloacetate and acetyl-CoA, a key step in the reductive tricarboxylic acid pathway of CO2 assimilation used by a variety of autotrophic bacteria and archaea to fix carbon dioxide. ATP citrate synthase is composed of two distinct subunits. In eukaryotes, ATP citrate synthase is a homotetramer of a single large polypeptide, and is used to produce cytosolic acetyl-CoA from mitochondrial-produced citrate.

    Proteins where this domain is known:
    PF10_0218    PFF0455w   


    SSF48300 - Ribosomal_L12/7 (Superfamily link)

    Interpro entry IPR008932 : Ribosomal protein L7/L12, oligomerisation (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L7/12 consists of two domains that are connected by a flexible region. The N-terminal domain is required for dimer formation and for anchoring the protein to the ribosome by binding to ribosomal protein L10, while the C-terminal domain is required for binding translation factors.

    Proteins where this domain is known:
    PFE1225w   


    SSF48317 - AcPase_VanPerase (Superfamily link)

    Interpro entry IPR000326 : Phosphatidic acid phosphatase type 2/haloperoxidase (Interpro link)

    Interpro description:

    This entry represents type 2 phosphatidic acid phosphatase (PAP2) enzymes, such as phosphatidylglycerophosphatase B from Escherichia coli. PAP2 enzymes have a core structure consisting of a 5-helical bundle, where the beginning of the third helix binds the cofactor. PAP2 enzymes catalyse the dephosphorylation of phosphatidate, yielding diacylglycerol and inorganic phosphate. In eukaryotic cells, PAP activity has a central role in the synthesis of phospholipids and triacylglycerol through its product diacylglycerol, and it also generates and/or degrades lipid-signalling molecules that are related to phosphatidate.

    Other related enzymes have a similar core structure, including haloperoxidases such as bromoperoxidase (contains one core bundle, but forms a dimer), chloroperoxidases (contains two core bundles arranged as in other family dimers), bacitracin transport permease from Bacillus licheniformis, and glucose-6-phosphatase from rat. The vanadium-dependent haloperoxidases exclusively catalyse the oxidation of halides, and act as histidine phosphatases, using histidine for the nucleophilic attack in the first step of the reaction. Amino acid residues involved in binding phosphate/vanadate are conserved between the two families, supporting a proposal that vanadium passes through a tetrahedral intermediate during the reaction mechanism.

    Proteins where this domain is known:
    MAL8P1.202   


    SSF48334 - DNA_repair_MutS_domIII (Superfamily link)

    Interpro entry IPR007696 : DNA mismatch repair protein MutS, core (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex structure, and is composed of:

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the core domain (domain 3) found in proteins of the MutS family. The core domain of MutS adopts a multi-helical structure comprised of two subdomains, which are interrupted by the clamp domain. Two of the helices in the core domain comprise the levers that extend towards the DNA.

    Proteins where this domain is known:
    MAL7P1.206    PF14_0254    PFE0270c   


    SSF48350 - Rho_GAP (Superfamily link)

    Interpro entry IPR008936 : Rho GTPase activation protein (Interpro link)

    Interpro description:

    Proteins containing a RhoGAP (Rho GTPase Activating Protein) domain usually function to catalyze the hydrolysis of GTP that is bound to Rho, Rac and/or Cdc42, inactivating these regulators of the actin cytoskeleton. The 53 known human RhoGAP domain-containing proteins are the largest known group of Rho GTPase regulators and significantly outnumber the 21 Rho GTPases they presumably regulate. This excess of GAP proteins probably indicates complex regulation of the Rho GTPases and is consistent with the existence of almost as many (48) human Dbl domain-containing Rho GEFs that act antagonistically to the RhoGAP proteins by activating the Rho GTPases. Phylogenetic analysis offers evidence for frequent domain duplication and for duplication of the entire genes containing these GAP domains.

    Proteins where this domain is known:
    PF10_0071   


    SSF48371 - ARM-type_fold (Superfamily link)

    Interpro entry IPR016024 : Armadillo-type fold (Interpro link)

    Interpro description:

    This entry represents a structural domain with an armadillo (ARM)-like fold, consisting of a multi-helical fold comprised of two curved layers of alpha helices arranged in a regular right-handed superhelix, where the repeats that make up this structure are arranged about a common axis. These superhelical structures present an extensive solvent-accessible surface that is well suited to binding large substrates such as proteins and nucleic acids. Domains and repeats with an ARM-like fold have been found in a number of proteins, including:

    The sequence similarity among these different repeats or domains is low; however, they exhibit considerable structural similarity. Furthermore, the number of repeats present in the superhelical structure can vary between orthologues, indicating that rapid loss/gain of repeats has occurred frequently in evolution. A common phylogenetic origin has been proposed for the armadillo and HEAT repeats.

    Proteins where this domain is known:
    MAL13P1.105    MAL13P1.123    MAL13P1.26    MAL13P1.308    MAL13P1.352    MAL13P1.63    MAL13P1.83    MAL7P1.164    MAL7P1.19    MAL7P1.202    MAL8P1.123    MAL8P1.42    PF08_0069    PF08_0087    PF08_0089    PF10_0335    PF11_0086    PF11_0318    PF11_0368    PF11_0397    PF11_0463    PF11_0527    PF13_0013    PF13_0034    PF13_0352    PF14_0031    PF14_0113    PF14_0196    PF14_0216    PF14_0239    PF14_0277    PF14_0304    PF14_0468    PF14_0529    PF14_0540    PF14_0632    PFB0260w    PFC0135c    PFC0245c    PFC0375c    PFC0441c    PFD0525w    PFD0720w    PFD0825c    PFE0100w    PFE0170c    PFE0375w    PFE0485w    PFE0765w    PFE0935c    PFE1195w    PFE1400c    PFF0655c    PFF0830w    PFF1030w    PFF1345w    PFI0200c    PFI0830c    PFI1265w    PFI1590c    PFL0405w    PFL0675c    PFL0930w    PFL1240c    PFL1510c    PFL1855w   


    SSF48403 - ANK (Superfamily link)

    Interpro entry IPR002110 : (Interpro link)

    Interpro description:

    The ankyrin repeat is one of the most common protein-protein interaction motifs in nature. Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a large number of functionally diverse proteins mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function such as transcriptional initiators, cell-cycle regulators, cytoskeletal proteins, ion transporters and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function since there is no specific sequence or structure which is universally recognised by it.

    The conserved fold of the ankyrin repeat unit is known from several crystal and solution structures. Each repeat folds into a helix-loop-helix structure with a beta-hairpin/loop region projecting out from the helices at a 90° angle. The repeats stack together to form an L-shaped structure.

    Proteins where this domain is known:
    MAL13P1.126    MAL13P1.71    MAL8P1.28    PF10_0102    PF10_0213    PF10_0328    PF11_0197    PF11_0439    PF14_0106    PF14_0222    PF14_0690    PFC0160w    PFE0400w    PFF1315w    PFF1365c    PFL2200w   


    SSF48425 - Sec7 (Superfamily link)

    Interpro entry IPR000904 : SEC7-like (Interpro link)

    Interpro description:

    The SEC7 domain was named after the first protein found to contain such a region. It has been shown to be linked with guanine nucleotide exchange function. The 3D structure of the domain displays several alpha-helices. It was found to be associated with other domains involved in guanine nucleotide exchange (e.g., CDC25, Dbl) in mammalian factors.

    Proteins where this domain is known:
    PF14_0407   


    SSF48439 - Prenyl_trans (Superfamily link)

    Proteins where this domain is known:
    MAL8P1.60    PF11_0108    PF14_0031    PF14_0403    PFE1320w    PFL2050w   


    SSF48445 - 14-3-3 (Superfamily link)

    Interpro entry IPR000308 : 14-3-3 protein (Interpro link)

    Interpro description:

    The 14-3-3 proteins are a large family of approximately 30kDa acidic proteins which exist primarily as homo- and heterodimeric within all eukaryotic cells. There is a high degree of sequence identity and conservation between all the 14-3-3 isotypes, particularly in the regions which form the dimer interface or line the central ligand binding channel of the dimeric molecule. Each 14-3-3 protein sequence can be roughly divided into three sections: a divergent amino terminus, the conserved core region and a divergent carboxyl terminus. The conserved middle core region of the 14-3-3s encodes an amphipathic groove that forms the main functional domain, a cradle for interacting with client proteins. The monomer consists of nine helices organised in an antiparallel manner, forming an L-shaped structure. The interior of the L-structure is composed of four helices: H3 and H5, which contain many charged and polar amino acids, and H7 and H9, which contain hydrophobic amino acids. These four helices form the concave amphipathic groove that interacts with target peptides.

    14-3-3 proteins mainly bind proteins containing phosphothreonine or phosphoserine motifs; however, exceptions to this rule do exist. Extensive investigation of the 14-3-3 binding site of the mammalian serine/threonine kinase Raf-1 has produced a consensus sequence for 14-3-3-binding, RSxpSxP (in the single-letter amino-acid code, where x denotes any amino acid and p indicates that the next residue is phosphorylated). 14-3-3 proteins appear to effect intracellular signalling in one of three ways: by direct regulation of the catalytic activity of the bound protein; by regulating interactions between the bound protein and other molecules in the cell by sequestration or modification; or by controlling the subcellular localisation of the bound ligand. Proteins appear to initially bind to a single dominant site and then subsequently to many, much weaker secondary interaction sites. The 14-3-3 dimer is capable of changing the conformation of its bound ligand whilst itself undergoing minimal structural alteration.
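
    The RSxpSxP consensus described above can be illustrated with a minimal sketch that scans a sequence for the unphosphorylated template R-S-x-S-x-P. The function name and the toy sequence are hypothetical, chosen only for illustration; a match marks a candidate site only, since phosphorylation state cannot be read from sequence alone and exceptions to the consensus exist.

```python
import re

# Unphosphorylated template of the RSxpSxP consensus:
# R, S, any residue, S (the residue that would be phosphorylated), any residue, P.
# The lookahead lets overlapping candidate sites be reported as well.
MOTIF = re.compile(r"(?=(RS[A-Z]S[A-Z]P))")

def find_candidate_sites(sequence):
    """Return (1-based start position, matched hexapeptide) for each candidate site."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(sequence.upper())]

# Hypothetical toy sequence, not a real P. falciparum protein.
toy = "MKRSASEPLLRSTSAPQ"
print(find_candidate_sites(toy))  # [(3, 'RSASEP'), (11, 'RSTSAP')]
```

    A hit from such a scan would still need experimental or database support before being treated as a genuine 14-3-3 binding site.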

    Proteins where this domain is known:
    MAL13P1.309    MAL8P1.69    PF14_0220   


    SSF48452 - SSF48452 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.139    MAL13P1.18    MAL13P1.274    MAL13P1.52    MAL7P1.111    PF07_0026    PF11_0101    PF11_0108    PF11_0124    PF11_0433    PF13_0079    PF13_0107    PF13_0190    PF13_0231    PF14_0042    PF14_0061    PF14_0098    PF14_0196    PF14_0263    PF14_0324    PFB0610c    PFC0330w    PFC0515c    PFD0180c    PFE0085c    PFE0445c    PFE1370w    PFE1545c    PFF0080c    PFF0490w    PFF1020c    PFF1505w    PFI1060w    PFL0280c    PFL0615w    PFL1605w    PFL1735c    PFL2015w    PFL2120w    PFL2275c   


    SSF48464 - ENTH_VHS (Superfamily link)

    Interpro entry IPR008942 : (Interpro link)

    Interpro description:

    This entry represents domains with a multi-helical, alpha-alpha 2-layered structural fold as found in: the ENTH domain of Epsin; the VHS domain of Hrs, Tom1, and ADP-ribosylation factors; the RPR domain of PCF11 protein; and the N-terminal domain of phosphoinositide-binding clathrin adaptor.

    The epsin NH2-terminal homology (ENTH) domain is a membrane interacting module composed of a superhelix of alpha-helices. It is present at the NH2-terminus of proteins that often contain consensus sequences for binding to clathrin coat components and their accessory factors, and therefore function as endocytic adaptors. ENTH domain containing proteins have additional roles in signalling and actin regulation and may have yet other actions in the nucleus. The ENTH domain is structurally similar to the VHS domain.

    The ENTH domain is approximately 150 amino acids long. The ENTH domain forms a compact globular structure, composed of eight alpha-helices connected by loops of varying length. Three helical hairpins that are stacked consecutively with a right-handed twist determine the general topology of the domain. This stacking gives the ENTH domain a rectangular appearance when viewed face on. The most highly conserved amino acids fall roughly into two classes: internal residues that are involved in packing and therefore are necessary for structural integrity, and solvent accessible residues that may be involved in protein-protein interactions.

    VHS domains are found at the N-termini of select proteins involved in intracellular membrane trafficking. The domain consists of eight helices arranged in a superhelix. The surface of the domain has two main features: a basic patch on one side due to several conserved positively charged residues on helix 3 and a negatively charged ridge on the opposite side, formed by residues on helix 2. Comparison of the two VHS domains and the ENTH domain reveals a conserved surface, composed of helices 2 and 4, that is utilised for protein-protein interactions. In addition, VHS domain-containing proteins are also often localized to membranes. It has therefore been suggested that the conserved positively charged surface of helix 3 in VHS and ENTH domains plays a role in membrane binding.

    Proteins where this domain is known:
    PF13_0041    PF14_0569    PFL2195w   


    SSF48537 - PLC_Nuclease (Superfamily link)

    Interpro entry IPR008947 : Phospholipase C/P1 nuclease, core (Interpro link)

    Interpro description:

    The enzymes belonging to this family are involved in phosphate ester hydrolysis and contain a triad of closely spaced zinc ions at their active centres. Both families of enzymes hydrolyse phosphodiesters. Substrates for phospholipase C are phosphatidylinositol and phosphatidylcholine, while P1 nuclease is an endonuclease hydrolysing single stranded ribo- and deoxyribonucleotides. P1 nuclease also has activity as a phosphomonoesterase against 3'-terminal phosphates of nucleotides. The Zn ions in both enzymes form almost identical trinuclear sites.

    Proteins where this domain is known:
    PF14_0117    PF14_0119    PFI0385c   


    SSF48557 - L-Aspartase-like (Superfamily link)

    Interpro entry IPR008948 : L-Aspartase-like (Interpro link)

    Interpro description:

    The enzyme L-aspartate ammonia-lyase (aspartase) catalyses the reversible deamination of the amino acid L-aspartic acid, using a carbanion mechanism to produce fumaric acid and ammonium ion. Aspartases from different organisms show high sequence homology, and this homology extends to functionally related enzymes such as the class II fumarases, the argininosuccinate and adenylosuccinate lyases. The high-resolution structure of aspartase reveals a monomer that is composed of three domains oriented in an elongated S-shape. The central domain, composed of five helices, provides the subunit contacts in the functionally active tetramer. The active sites are located in clefts between the subunits, and structural and mutagenic studies have identified several of the active site functional groups. A separate regulatory site has been identified. The substrate, aspartic acid, can also play the role of an activator, binding at this site along with a required divalent metal ion.

    Proteins where this domain is known:
    PFB0295w   


    SSF48576 - Terpenoid_synth (Superfamily link)

    Interpro entry IPR008949 : (Interpro link)

    Interpro description:

    Terpenoid cyclases catalyze remarkably complex cyclisation cascades that are initiated by the formation of a highly reactive carbocation in a polyisoprene substrate. The pathways of monoterpene, sesquiterpene, and diterpene biosynthesis are conveniently divided into several stages. The first encompasses the synthesis of isopentenyl diphosphate, isomerization to dimethylallyl diphosphate, prenyltransferase-catalysed condensation of these two C5-units to geranyl diphosphate (GDP), and the subsequent 1'-4 additions of isopentenyl diphosphate to generate farnesyl (FDP) and geranylgeranyl (GGDP) diphosphate. In the second stage, the prenyl diphosphates undergo a range of cyclisations based on variations on the same mechanistic theme to produce the parent skeletons of each class. Thus, GDP (C10) gives rise to monoterpenes, FDP (C15) to sesquiterpenes, and GGDP (C20) to diterpenes. These transformations catalysed by the terpenoid synthases (cyclases) may be followed by a variety of redox modifications of the parent skeletal types to produce the many thousands of different terpenoid metabolites of the essential oils, turpentines, and resins of plant origin. Terpenoid synthase enzymes provide a template for binding and stabilizing the flexible substrate in the precise orientation required for catalysis, trigger carbocation formation, chaperone the conformations of the reactive carbocation intermediates through a unique cyclisation sequence, and sequester and stabilize carbocations from premature quenching.
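
    The carbon counts quoted above follow from the number of C5 units condensed in the first stage. The short sketch below reproduces that arithmetic; the dictionary and function names are illustrative only and are not part of the Interpro entry.

```python
C5_UNIT = 5  # carbons per isopentenyl / dimethylallyl diphosphate unit

# Number of C5 units condensed into each prenyl diphosphate, and the
# terpene class it gives rise to, as stated in the description above.
PRENYL_DIPHOSPHATES = {
    "GDP": (2, "monoterpenes"),
    "FDP": (3, "sesquiterpenes"),
    "GGDP": (4, "diterpenes"),
}

def carbon_count(n_units):
    """Carbon atoms in a chain built from n_units C5 units."""
    return n_units * C5_UNIT

for name, (units, product) in PRENYL_DIPHOSPHATES.items():
    print(f"{name}: C{carbon_count(units)} -> {product}")
```

    Each 1'-4 addition of isopentenyl diphosphate therefore extends the chain by five carbons, which is why the three classes differ by C5 steps.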

    Proteins where this domain is known:
    PF11_0295    PFB0130w   


    SSF48592 - GroEL-ATPase (Superfamily link)

    Interpro entry IPR002423 : Chaperonin Cpn60/TCP-1 (Interpro link)

    Interpro description:

    Partially folded polypeptide chains, either newly made by ribosomes or emerging from mature proteins unfolded by stress, run the risk of aggregating with one another to the detriment of the organism. Folding of newly synthesised polypeptides in the crowded cellular environment requires the assistance of molecular chaperone proteins, such as the large bacterial chaperonins GroEL and GroES.

    GroEL and GroES prevent aggregation by encapsulating individual chains within the so-called 'Anfinsen cage' provided by the GroEL-GroES complex, where they can fold in isolation from one another. GroEL consists of two heptameric rings of identical ATPase subunits stacked back to back, containing a cage in each ring. Each subunit consists of three domains. The equatorial domain contains the nucleotide binding site and is connected by a flexible intermediate domain with the apical domain. The latter presents several hydrophobic amino-acid side chains at the top of the ring, orientated towards the cavity of the cage. These side chains are involved in binding either a partially folded polypeptide chain or a single molecule of GroES.

    The assembly of proteins has been thought to be the sole result of properties inherent in the primary sequence of polypeptides themselves. In some cases, however, structural information from other protein molecules is required for correct folding and subsequent assembly into oligomers. These 'helper' molecules are referred to as molecular chaperones, a subfamily of which are the chaperonins, which include 10 kDa and 60 kDa proteins. These are found in abundance in prokaryotes, chloroplasts and mitochondria. They are required for normal cell growth (as demonstrated by the fact that no temperature sensitive mutants for the chaperonin genes can be found in the temperature range 20 to 43 degrees centigrade), and are stress-induced, acting to stabilise or protect disassembled polypeptides under heat-shock conditions.

    The 10 kDa chaperonin (cpn10 - or groES in bacteria) exists as a ring-shaped oligomer of between 6 and 8 identical subunits, whereas the 60 kDa chaperonin (cpn60 - or groEL in bacteria) forms a structure comprising 2 stacked rings, each ring containing 7 identical subunits. These ring structures assemble by self-stimulation in the presence of Mg2+-ATP. The cpn10 and cpn60 oligomers also require Mg2+-ATP in order to interact to form a functional complex, although the mechanism of this interaction is as yet unknown. This chaperonin complex is essential for the correct folding and assembly of polypeptides into oligomeric structures, of which the chaperonins themselves are not a part. The binding of cpn10 to cpn60 inhibits the weak ATPase activity of cpn60.

    The 60 kDa form of chaperonin is the immunodominant antigen of patients with Legionnaires' disease, and is thought to play a role in the protection of the Legionella bacteria from oxygen radicals within macrophages. This hypothesis is based on the finding that the cpn60 gene is upregulated in response to hydrogen peroxide, a source of oxygen radicals. Cpn60 has also been found to display strong antigenicity in many bacterial species, and has the potential for inducing immune protection against unrelated bacterial infections. The RuBisCO subunit binding protein (which has been implicated in the assembly of RuBisCO) and cpn60 have been found to be evolutionary homologues, the RuBisCO subunit binding protein having the C-terminal Gly-Gly-Met repeat found in all bacterial cpn60 sequences. Although the precise function of this repeat is unknown, it is thought to be important as it is also found in 70 kDa heat-shock proteins. The crystal structure of Escherichia coli GroEL has been resolved to 2.8 Å. The TCP-1 family of proteins act as molecular chaperones for tubulin, actin and probably some other proteins. They are weakly, but significantly, related to the cpn60/groEL chaperonin family.

    Proteins where this domain is known:
    MAL13P1.283    PF10_0153    PF11_0331    PFB0635w    PFC0285c    PFC0350c    PFC0900w    PFF0430w    PFL1425w    PFL1545c   


    SSF48613 - Heme_oxygenase (Superfamily link)

    Interpro entry IPR016084 : (Interpro link)

    Interpro description:

    This entry represents a multi-helical structural domain consisting of two structural repeats (duplication) of a 3-helical motif. This domain can be found in both eukaryotic and prokaryotic haem oxygenases, in TENA/THI-4 proteins that lack the haem-binding site, and in coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein C (PqqC).

    Haem oxygenase (HO) is the microsomal enzyme that, in animals, carries out the oxidation of haem, cleaving the haem ring at the alpha-methene bridge to form biliverdin and carbon monoxide. Biliverdin is subsequently converted to bilirubin by biliverdin reductase. In mammals there are three isozymes of haem oxygenase: HO-1 to HO-3. The first two isozymes differ in their tissue expression and their inducibility: HO-1 is highly inducible by its substrate haem and by various non-haem substances, while HO-2 is non-inducible. Haem oxygenase is also present in certain bacteria, where it is involved in the acquisition of iron from the host haem.

    The THI-4 protein is involved in thiamine biosynthesis, while TENA is one of a number of proteins that enhance the expression of extracellular enzymes, such as alkaline protease, neutral protease and levansucrase.

    Coenzyme PQQ (pyrrolo-quinoline-quinone) biosynthesis protein C (PqqC) is required for the synthesis of PQQ, a prosthetic group found in several bacterial enzymes, including methanol dehydrogenase of methylotrophs and the glucose dehydrogenase of a number of bacteria.

    Proteins where this domain is known:
    PF10_0116   


    SSF48662 - Ribosomal_L39 (Superfamily link)

    Interpro entry IPR000077 : Ribosomal protein L39e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins are very basic. About 50 residues long, they are the smallest proteins of eukaryotic-type ribosomes.

    Proteins where this domain is known:
    PFF0573c   


    SSF48690 - ATP_synth_E (Superfamily link)

    Interpro entry IPR006721 : ATPase, F1 complex, epsilon subunit, mitochondrial (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family constitutes the mitochondrial ATP synthase epsilon subunit, which is distinct from the bacterial epsilon subunit (the latter being homologous to the mitochondrial delta subunit). The mitochondrial epsilon subunit is located in the stalk region of the F1 complex, and acts as an inhibitor of the ATPase catalytic core. The epsilon subunit can assume two conformations, contracted and extended, where the latter inhibits ATP hydrolysis. The conformation of the epsilon subunit is determined by the direction of rotation of the gamma subunit, and possibly by the presence of ADP. The epsilon subunit is thought to become extended in the presence of ADP, thereby acting as a safety lock to prevent wasteful ATP hydrolysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    MAL7P1.75   


    SSF48726 - Immunoglobulin (Superfamily link)

    Proteins where this domain is known:
    PF10_0374   


    SSF49348 - Clath_adapt (Superfamily link)

    Interpro entry IPR013041 : Clathrin/coatomer adaptor, adaptin-like, appendage, Ig-like subdomain (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.

    Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.

    While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least alpha, beta, beta', gamma, delta, epsilon and zeta subunits.

    This entry represents a beta-sandwich structural motif found in the appendage (ear) domain of alpha-, beta- and gamma-adaptin from AP clathrin adaptor complexes, the GAE (gamma-adaptin ear) domain of GGA adaptor proteins, and the appendage domain of the gamma subunit of coatomer complexes. These domains have an immunoglobulin-like beta-sandwich fold containing 7 or 8 strands in 2 beta-sheets in a Greek key topology. Although the appendage domains from AP / GGA adaptins and coatomers share a similar fold, there is little sequence identity between them. However, they also share similar motif-based cargo recognition and accessory factor recruitment mechanisms.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0463    PF14_0529    PFE1400c    PFF0830w    PFL2220w   


    SSF49354 - PapD-like (Superfamily link)

    Interpro entry IPR008962 : (Interpro link)

    Interpro description:

    The PapD-like superfamily of periplasmic chaperones directs the assembly of over 30 diverse adhesive surface organelles that mediate the attachment of many different pathogenic bacteria to host tissues, a critical early step in the development of disease. PapD, the prototypical chaperone, is necessary for the assembly of P pili. P pili contain the adhesin PapG, which mediates the attachment of uropathogenic Escherichia coli to Gal(alpha) Gal receptors present on kidney cells and are critical for the initiation of pyelonephritis. The PapD-like chaperones consist of two Ig-like domains oriented toward each other, forming L-shaped molecules. In the chaperone-subunit complex, the G1beta strand of the chaperone completes an atypical Ig fold in the subunit by occupying the groove and running parallel to the subunit C-terminal F strand. This donor strand complementation interaction simultaneously stabilizes pilus subunits and caps their interactive surfaces, preventing their premature oligomerisation in the periplasm. During pilus biogenesis, the highly conserved N-terminal extension of one subunit has been proposed to displace the chaperone G1beta strand from its neighbouring subunit in a mechanism termed donor strand exchange.

    This entry represents the immunoglobulin (Ig)-like beta-sandwich domain found in PapD, as well as in other periplasmic chaperone proteins that include FimC and SfaE from E. coli, and Caf1m from Yersinia pestis. In addition, major sperm proteins (MSP) and other related sperm proteins (such as WR4 and SSP-19) contain an Ig-like domain with a similar structural fold to PapD. Major sperm proteins are central components in molecular interactions underlying sperm motility, with many isoforms existing in Caenorhabditis elegans.

    Proteins where this domain is known:
    PF14_0377   


    SSF49417 - P53_like_DNA_bnd (Superfamily link)

    Interpro entry IPR008967 : p53-like transcription factor, DNA-binding (Interpro link)

    Interpro description:

    This domain is found in a number of transcription factors, including p53, NFATC, TonEBP, STAT-1, and NFkappaB, where it is responsible for DNA-binding. These transcription factors play diverse roles in the regulation of cellular functions: the p53 tumour suppressor upregulates the expression of genes involved in cell cycle arrest and apoptosis; NFATC regulates the production of effector proteins involved in coordinating the immune response; TonEBP regulates gene expression induced by osmotic stress and helps regulate intracellular volume during cell growth; STAT-1 plays an important role in B lymphocyte growth and function; and NFkappaB is involved in the inflammatory response. The DNA-binding domain acts to clamp, or in the case of TonEBP, encircle the DNA target in order to stabilize the protein-DNA complex. Protein interactions may also serve to stabilize the protein-DNA complex, for example in the STAT-1 dimer the SH2 (Src homology 2) domain in each monomer is coupled to the DNA-binding domain to increase stability. The DNA-binding domain consists of a beta-sandwich formed of 9 strands in 2 sheets with a Greek-key topology. This structure is found in many transcription factors, often within the DNA-binding domain.

    Proteins where this domain is known:
    PF11_0091   


    SSF49447 - AP50 (Superfamily link)

    Interpro entry IPR008968 : Clathrin adaptor, mu subunit, C-terminal (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    AP (adaptor protein) complexes are found in coated vesicles and clathrin-coated pits. AP complexes connect cargo proteins and lipids to clathrin at vesicle budding sites, as well as binding accessory proteins that regulate coat assembly and disassembly (such as AP180, epsins and auxilin). There are different AP complexes in mammals. AP1 is responsible for the transport of lysosomal hydrolases between the TGN and endosomes. AP2 associates with the plasma membrane and is responsible for endocytosis. AP3 is responsible for protein trafficking to lysosomes and other related organelles. AP4 is less well characterised. AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). For example, in AP1 these subunits are gamma-1-adaptin, beta-1-adaptin, mu-1 and sigma-1, while in AP2 they are alpha-adaptin, beta-2-adaptin, mu-2 and sigma-2. Each subunit has a specific function. Adaptins recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal ear (appendage) domains. Mu recognises tyrosine-based sorting signals within the cytoplasmic domains of transmembrane cargo proteins. One function of clathrin and AP2 complex-mediated endocytosis is to regulate the number of GABA(A) receptors available at the cell surface.

    This entry represents the C-terminal domain of the mu subunit from various clathrin adaptors (AP1, AP2 and AP3). The C-terminal domain has an immunoglobulin-like beta-sandwich fold consisting of 9 strands in 2 sheets with a Greek key topology, similar to that found in cytochrome f and certain transcription factors. The mu subunit regulates the coupling of clathrin lattices with particular membrane proteins by self-phosphorylation via a mechanism that is still unclear. The mu subunit possesses a highly conserved N-terminal domain of around 230 amino acids, which may be the region of interaction with other AP proteins; a linker region of between 10 and 42 amino acids; and a less well-conserved C-terminal domain of around 190 amino acids, which may be the site of specific interaction with the protein being transported in the vesicle.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0202    PF13_0062    PF14_0386    PFL0885w   


    SSF49493 - HSP40_DnaJ_pep (Superfamily link)

    Interpro entry IPR008971 : HSP40/DnaJ peptide-binding (Interpro link)

    Interpro description:

    The Escherichia coli Hsp40 DnaJ and Hsp70 DnaK cooperate in the binding of proteins at intermediate stages of folding, assembly, and translocation across membranes. Binding of protein substrates to the DnaK C-terminal domain is controlled by ATP binding and hydrolysis in the N-terminal ATPase domain. The interaction of DnaJ with DnaK is mediated at least in part by the highly conserved N-terminal J-domain of DnaJ. The J-domain interaction is localized to the ATPase domain of DnaK and is likely to be dominated by electrostatic interactions. The J-domain may tether DnaK to DnaJ-bound substrates, which DnaK then binds with its C-terminal peptide-binding domain. The peptide-binding domain of DnaJ consists of a beta sandwich made up of 6 beta-strands divided into 2 sheets.

    Proteins where this domain is known:
    PF14_0359    PFA0660w    PFB0090c    PFB0595w    PFD0462w    PFE0055c    PFF1415c   


    SSF49503 - Cupredoxin (Superfamily link)

    Interpro entry IPR008972 : (Interpro link)

    Interpro description:

    Copper is one of the most prevalent transition metals in living organisms and its biological function is intimately related to its redox properties. Since free copper is toxic, even at very low concentrations, its homeostasis in living organisms is tightly controlled by subtle molecular mechanisms. In eukaryotes, before being transported inside the cell via the high-affinity copper transporters of the CTR family, the copper (II) ion is reduced to copper (I). In blue copper proteins such as Cupredoxin, the copper (I) ion form is stabilised by a constrained His2Cys coordination environment.

    This entry represents cupredoxin proteins, as well as structural homologues to cupredoxin. Structurally, the cupredoxin-like fold consists of a beta-sandwich with 7 strands in 2 beta-sheets, which is arranged in a Greek-key beta-barrel. Some of these proteins have lost the ability to bind copper. Proteins with a cupredoxin-type fold are found in the following family groups:

    Proteins where this domain is known:
    PF13_0327    PF14_0288   


    SSF49562 - C2_CaLB (Superfamily link)

    Interpro entry IPR008973 : (Interpro link)

    Interpro description:

    The Ca2+-dependent, lipid-binding domain (CaLB) has been identified in a number of proteins, for example the amino-terminal, 138 amino acid C2 domain of cytosolic phospholipase A2 (cPLA2-C2) which mediates an initial step in the production of lipid mediators of inflammation: the Ca2+-dependent translocation of the enzyme to intracellular membranes with subsequent liberation of arachidonic acid. The domain is composed of eight antiparallel beta-strands with six interconnecting loops that fits the "type II" topology for C2 domains. The structure has been identified as a beta-sandwich in the "Greek key" motif.

    Proteins where this domain is known:
    MAL8P1.134    PF10_0362    PF11_0107    PF13_0289    PF14_0530    PFF0185c    PFL0925w    PFL2110c   


    SSF49599 - Traf_like (Superfamily link)

    Interpro entry IPR008974 : (Interpro link)

    Interpro description:

    The tumour necrosis factor receptor (TNFR) associated factors (TRAFs) act as signal transducers for both TNFRs and interleukin-1/Toll-like receptors. TRAFs function in immunity, embryonic development, stress response and bone metabolism through their induction of cell proliferation, differentiation, and apoptosis. TRAFs are characterised by two domains: an N-terminal domain containing RING and zinc finger motifs that is essential for the activation of downstream effectors, and a C-terminal TRAF domain that is essential for self-association and receptor interaction. The TRAF domain-like fold is a beta-sandwich consisting of 8 strands in 2 beta-sheets and has a circularly permuted Greek-key immunoglobulin-fold topology that contains an extra strand.

    The substrate-binding domain (SBD) of the SIAH (seven in absentia homolog) family of proteins is structurally highly similar to the TRAF domain. The SIAH SBD interacts with a number of proteins, and is involved in TNF-alpha-mediated NFkappaB activation.

    Proteins where this domain is known:
    PF10_0127    PFE0570w   


    SSF49723 - Lipase_LipOase (Superfamily link)

    Interpro entry IPR008976 : (Interpro link)

    Interpro description:

    Lipoxygenases are a class of iron-containing dioxygenases that catalyse the hydroperoxidation of lipids containing a cis,cis-1,4-pentadiene structure. They are common in plants, where they may be involved in a number of diverse aspects of plant physiology including growth and development, pest resistance, and senescence or responses to wounding. In mammals a number of lipoxygenase isozymes are involved in the metabolism of prostaglandins and leukotrienes. Sequence data are available for a number of lipoxygenases.

    The iron atom in lipoxygenases is bound by four ligands, three of which are histidine residues. Six histidines are conserved in all lipoxygenase sequences; five of them are clustered in a stretch of 40 amino acids. This region contains two of the three iron-binding histidine ligands; the other histidines have been shown to be important for the activity of lipoxygenases.

    This entry represents a domain found in lipoxygenases and other enzymes. Known as the PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology) domain, it is found in a variety of membrane- or lipid-associated proteins. Structurally, this domain forms a beta-sandwich composed of two sheets of four strands each. The most highly conserved regions coincide with the beta-strands, with most of the highly conserved residues being buried within the protein. An exception to this is a lysine or arginine that occurs on the surface of the fifth beta-strand of the eukaryotic domains. In pancreatic lipase, the lysine in this position forms a salt bridge with the procolipase protein. The conservation of a charged surface residue may indicate the location of a conserved ligand-binding site. It is thought that this domain may mediate membrane attachment via other protein binding partners.

    Proteins where this domain is known:
    PF14_0067   


    SSF49758 - Peptidase_C2 (Superfamily link)

    Interpro entry IPR001300 : Peptidase C2, calpain (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This group of cysteine peptidases belongs to the MEROPS peptidase family C2 (calpain family, clan CA). A type example is calpain, which is an intracellular protease involved in many important cellular functions that are regulated by calcium. The protein is a complex of 2 polypeptide chains (light and heavy), with three known forms in mammals: a highly calcium-sensitive (i.e., micromolar range) form known as mu-calpain, mu-CANP or calpain I; a form sensitive to calcium in the millimolar range, known as m-calpain, m-CANP or calpain II; and a third form, known as p94, which is found in skeletal muscle only.

    All forms have identical light but different heavy chains. Both mu- and m-calpain are heterodimers containing an identical 28-kDa subunit and an 80-kDa subunit that shares 55-65% sequence homology between the two proteases. The crystallographic structure of m-calpain reveals six "domains" in the 80-kDa subunit:

    1. A 19-amino acid NH2-terminal sequence;
    2. Active site domain IIa;
    3. Active site domain IIb.

      Domain 2 shows low levels of sequence similarity to papain; although the catalytic His has not been located by biochemical means, it is likely that calpain and papain are related.

    4. Domain III;
    5. An 18-amino acid extended sequence linking domain III to domain IV;
    6. Domain IV, which resembles the penta EF-hand family of polypeptides, binds calcium and regulates activity.

      Ca2+ binding causes a rearrangement of the protein backbone, the net effect of which is that a Trp side chain, which acts as a wedge between catalytic domains IIa and IIb in the apo state, moves away from the active site cleft, allowing for the proper formation of the catalytic triad.

    Calpain-like mRNAs have been identified in other organisms including bacteria, but the molecules encoded by these mRNAs have not been isolated, so little is known about their properties. How calpain activity is regulated in these organisms' cells is still unclear. In metazoans, the activity of calpain is controlled by a single proteinase inhibitor, calpastatin. The calpastatin gene can produce eight or more calpastatin polypeptides ranging from 17 to 85 kDa by use of different promoters and alternative splicing events. The physiological significance of these different calpastatins is unclear, although all bind to three different places on the calpain molecule; binding to at least two of the sites is Ca2+ dependent. The calpains ostensibly participate in a variety of cellular processes including remodelling of cytoskeletal/membrane attachments, different signal transduction pathways, and apoptosis. Deregulated calpain activity following loss of Ca2+ homeostasis results in tissue damage in response to events such as myocardial infarcts, stroke, and brain trauma.

    Proteins where this domain is known:
    MAL13P1.310   


    SSF49764 - HSP20_chap (Superfamily link)

    Interpro entry IPR008978 : (Interpro link)

    Interpro description:

    Hsp20 is a mammalian small heat-shock protein family that occurs most abundantly in skeletal muscle and heart. It tends to form dimers via a disulphide linkage formed by an N-terminal cysteine, and has low heat stability and poor chaperoning ability in comparison with other family members. Structurally, this and related proteins contain a beta-sandwich fold consisting of 8 strands in 2 beta-sheets in a Greek-key topology.

    Proteins where this domain is known:
    MAL8P1.78    MAL8P1.96    PF13_0021    PF13_0204    PF14_0465    PF14_0510    PFC0581w    PFI0990c    PFI1325w    PFL0550w    PFL1765c    PFL1845c    PFL2450c   


    SSF49777 - PEBP (Superfamily link)

    Interpro entry IPR008914 : (Interpro link)

    Interpro description:

    The PEBP (PhosphatidylEthanolamine-Binding Protein) family is a highly conserved group of proteins that have been identified in numerous tissues in a wide variety of organisms, including bacteria, yeast, nematodes, plants, Drosophila and mammals. The various functions described for members of this family include lipid binding, neuronal development, serine protease inhibition, the control of the morphological switch between shoot growth and flower structures, and the regulation of several signalling pathways such as the MAP kinase pathway and the NF-kappaB pathway. The control of the latter two pathways involves the PEBP protein RKIP, which interacts with MEK and Raf-1 to inhibit the MAP kinase pathway, and with TAK1, NIK, IKKalpha and IKKbeta to inhibit the NF-kappaB pathway. Other PEBP-like proteins that show strong structural homology to PEBP include Escherichia coli YBHB and YBCL, the Rattus norvegicus (Rat) neuropeptide HCNP, and Antirrhinum majus (Garden snapdragon) protein centroradialis (CEN).

    Structures have been determined for several members of the PEBP-like family, all of which show extensive fold conservation. The structure consists of a large central beta-sheet flanked by a smaller beta-sheet on one side, and an alpha helix on the other. Sequence alignments show two conserved central regions, CR1 and CR2, that form a consensus signature for the PEBP family. These two regions form part of the ligand-binding site, which can accommodate various anionic groups. The N- and C-terminal regions are the least conserved, and may be involved in interactions with different protein partners. The N-terminal residues 2-12 form the natural cleavage peptide HCNP involved in neuronal development. The C-terminal region is deleted in plant and bacterial PEBP homologues, and may help control accessibility to the active site.

    Proteins where this domain is known:
    PFC0176c    PFL0955c   


    SSF49785 - Gal_bind_like (Superfamily link)

    Interpro entry IPR008979 : (Interpro link)

    Interpro description:

    Proteins containing a galactose-binding domain-like fold can be found in several different protein families, in both eukaryotes and prokaryotes. The common function of these domains is to bind to specific ligands, such as cell-surface-attached carbohydrate substrates for galactose oxidase and sialidase, phospholipids on the outer side of the mammalian cell membrane for coagulation factor Va, membrane-anchored ephrin for the Eph family of receptor tyrosine kinases, and a complex of broken single-stranded DNA and DNA polymerase beta for XRCC1.

    The structure of the galactose-binding domain-like members consists of a beta-sandwich, in which the strands making up the sheets exhibit a jellyroll fold. There is a high degree of similarity in the beta-sandwich and in the loops between different family members, despite an often low level of sequence similarity.

    Proteins where this domain is known:
    PF07_0120    PF14_0384    PF14_0532    PF14_0723    PFA0445w    PFL0850w   


    SSF49879 - SMAD_FHA (Superfamily link)

    Interpro entry IPR008984 : (Interpro link)

    Interpro description:

    FHA and SMAD (MH2) domains share a common structure consisting of a sandwich of eleven beta strands in two sheets with Greek key topology. Forkhead-associated (FHA) domains were originally identified as a sequence profile of about 75 amino acids, whereas the full-length domain is closer to about 150 amino acids. FHA domains are found in transcription factors, kinesin motors, and in a variety of other signalling molecules in organisms ranging from eubacteria to humans. FHA domains are protein-protein interaction domains that are specific for phosphoproteins. FHA-containing proteins function in maintaining cell-cycle checkpoints, DNA repair and transcriptional regulation. FHA domain proteins include the Chk2/Rad53/Cds1 family of proteins that contain one or more FHA domains, as well as a Ser/Thr kinase domain.

    SMAD (Mothers against decapentaplegic (MAD) homolog) domain proteins are found in a range of species from nematodes to humans. These highly conserved proteins contain an N-terminal MH1 domain that contacts DNA, and is separated by a short linker region from the C-terminal MH2 domain, the latter showing a striking similarity to FHA domains. SMAD proteins mediate signalling by the TGF-beta/activin/BMP-2/4 cytokines from receptor Ser/Thr protein kinases at the cell surface to the nucleus. SMAD proteins fall into three functional classes: the receptor-regulated SMADs (R-SMADs), including SMAD1, -2, -3, -5, and -8, each of which is involved in a ligand-specific signalling pathway; the comediator SMADs (co-SMADs), including SMAD4, which interact with R-SMADs to participate in signalling; and the inhibitory SMADs (I-SMADs), including SMAD6 and -7, which block the activation of R-SMADs and co-SMADs, thereby negatively regulating signalling pathways.

    Domains with this fold are also found as the transactivation domain of interferon regulatory factor 3 (IRF3), which has a weak homology to SMAD domains, and the N-terminal domain of EssC protein in Staphylococcus aureus.

    Proteins where this domain is known:
    MAL13P1.405    PF11_0347    PF13_0042    PFI0470w    PFL0275w   


    SSF49899 - ConA_like_lec_gl (Superfamily link)

    Interpro entry IPR008985 : (Interpro link)

    Interpro description:

    Lectins and glucanases exhibit the common property of reversibly binding to specific complex carbohydrates. The lectins/glucanases are a diverse group of proteins found in a wide range of species from prokaryotes to humans. The different family members all contain a concanavalin A-like domain, which consists of a sandwich of 12-14 beta strands in two sheets with a complex topology. Members of this family are diverse, and include the lectins: legume lectins, cereal lectins, viral lectins, and animal lectins. Plant lectins function in the storage and transport of carbohydrates in seeds, the binding of nitrogen-fixing bacteria to root hairs, the inhibition of fungal growth or insect feeding, and in hormonally regulated plant growth. Protein members include concanavalin A (Con A), favin, isolectin I, lectin IV, soybean agglutinin and lentil lectin. Animal lectins include the galectins, which are S-type lactose-binding and IgE-binding proteins such as S-lectin, CLC protein, galectin1, galectin2, galectin3 CRD, and Congerin I.

    Other members with a Con A-like domain include the glucanases and xylanases. Bacterial and fungal beta-glucanases, such as Bacillus 1-3,1-4-beta-glucanase, carry out the acid catalysis of beta-glucans found in microorganisms and plants. Similarly, kappa-carrageenase degrades kappa-carrageenans from marine red algae cell walls. Xylanase and cellobiohydrolase I degrade hemicellulose and cellulose, respectively.

    There are many Con A-like domains found in proteins involved in cell recognition and adhesion. For example, several viral and bacterial toxins carry Con A-like domains. Examples include the Clostridium neurotoxins responsible for the neuroparalytic effects of botulism and tetanus. The Pseudomonas exotoxin A, a virulence factor which is highly toxic to eukaryotic cells, causing the arrest of protein synthesis, contains a Con A-like domain involved in receptor binding. Vibrio cholerae neuraminidase can bind to cell surfaces, possibly through its Con A-like domains, where it functions as part of a mucinase complex to degrade the mucin layer of the gastrointestinal tract. The rotaviral outer capsid protein, VP4, has a Con A-like sialic acid binding domain, which functions in cell attachment and membrane penetration.

    Con A-like domains also play a role in cell recognition in eukaryotes. Proteins containing a Con A-like domain include the sex hormone-binding globulins, which transport sex steroids in blood and regulate their access to target tissues; laminins, which are large heterotrimeric glycoproteins involved in basement membrane architecture and function; neurexins, which are expressed in hundreds of isoforms on the neuronal cell surface, where they may function as cell recognition molecules; and sialidases, which are found in both microorganisms and animals, and function in cell adhesion and signal transduction.

    Other proteins containing a Con A-like domain include pentraxins and calnexins. The pentraxin PTX3 is a TNFalpha-induced, secreted protein of adipose cells produced during inflammation. The calnexin family of molecular chaperones is conserved among plants, fungi, and animals. Family members include calnexin, a type-I integral membrane protein in the endoplasmic reticulum which coordinates the processing of newly synthesized N-linked glycoproteins with their productive folding; calmegin, a type-I membrane protein expressed mainly in the spermatids of the testis; and calreticulin, a soluble ER lumenal paralog.

    Proteins where this domain is known:
    MAL13P1.19    MAL7P1.124    PF11_0168    PF14_0067    PF14_0084    PF14_0326    PF14_0419    PF14_0722    PFA0195w    PFE0570w    PFE1120w    PFI0185w    PFI0210c    PFL0910c   


    SSF50022 - Rieske_dom (Superfamily link)

    Interpro entry IPR005806 : Rieske [2Fe-2S] region (Interpro link)

    Interpro description:

    Ubiquinol-cytochrome c reductase (bc1 complex or complex III) is an enzyme complex of bacterial and mitochondrial oxidative phosphorylation systems. It catalyses the oxidoreduction of the mobile redox components ubiquinol and cytochrome c, generating an electrochemical potential, which is linked to ATP synthesis. The complex consists of three subunits in most bacteria, and nine in mitochondria: both bacterial and mitochondrial complexes contain cytochrome b and cytochrome c1 subunits, and an iron-sulphur 'Rieske' subunit, which contains a high potential 2Fe-2S cluster. The mitochondrial form also includes six other subunits that do not possess redox centres. Plastoquinone-plastocyanin reductase (b6f complex), present in cyanobacteria and the chloroplasts of plants, catalyses the oxidoreduction of plastoquinol and cytochrome f. This complex, which is functionally similar to ubiquinol-cytochrome c reductase, comprises cytochrome b6, cytochrome f and Rieske subunits.

    The Rieske subunit acts by binding either a ubiquinol or plastoquinol anion, transferring an electron to the 2Fe-2S cluster, then releasing the electron to the cytochrome c or cytochrome f haem iron. The Rieske domain has a [2Fe-2S] centre. Two conserved cysteines coordinate one Fe ion, while the other Fe ion is coordinated by two conserved histidines. The 2Fe-2S cluster is bound in the highly conserved C-terminal region of the Rieske subunit.

    Proteins where this domain is known:
    PF07_0085    PF14_0373   


    SSF50104 - Transl_SH3_like (Superfamily link)

    Interpro entry IPR008991 : (Interpro link)

    Interpro description:

    The fundamental activity of the ribosome is two-fold: to decode the message of the mRNA in the small subunit, and to form a peptide bond between peptidyl-tRNA and aminoacyl-tRNA by a peptidyl transferase activity in the large subunit. Several prokaryotic and eukaryotic proteins that are involved in the translation process contain an SH3-like domain. The structure of the translation protein SH3-like domain is a partly opened beta barrel, where the last strand is interrupted by a 3-10 helical turn. The structure of the RNA-binding C-terminal domain of the Bacillus stearothermophilus (Geobacillus stearothermophilus) ribosomal protein L2 has been shown to adopt the SH3-like barrel topology. The L2 protein is located near the peptidyl transferase centre in the large ribosomal subunit where it may contribute to peptidyl transferase activity, and is involved in the assembly of the 23S rRNA. Likewise, the N-terminal domain of the ubiquitous eukaryotic translation initiation factor 5A (IF-5A) protein adopts the SH3-like barrel topology. IF-5A is involved in the initial step of peptide bond formation in translation and in cell-cycle regulation. IF-5A acts as a cofactor of the Rev protein in HIV-1-infected cells and of the Rex protein in T-cell leukaemia virus 1-infected cells.

    Proteins where this domain is known:
    PF11_0337    PF14_0240    PFC0535w    PFE0845c    PFF0245w    PFI0655c    PFL0210c    PFL1150c   


    SSF50129 - GroES_like (Superfamily link)

    Interpro entry IPR011032 : (Interpro link)

    Interpro description:

    GroES (chaperonin 10) is an oligomeric molecular chaperone, which functions in protein folding and possibly in intercellular signalling, being found on the surface of various prokaryotic and eukaryotic cells, as well as being released from cells. Secreted chaperonins are thought to act as intercellular signals, interacting with a variety of cell types, including leukocytes, vascular endothelial cells and epithelial cells, as well as activating key cellular activities such as the synthesis of cytokines and adhesion proteins. GroES works as a co-chaperone with GroEL (chaperonin 60) during protein folding. The polypeptide substrate is captured by GroEL, which binds the co-chaperone GroES and ATP, and discharges the substrate into a unique microenvironment inside the chaperone, which promotes productive folding. After hydrolysis of ATP, the polypeptide is released into solution. GP31 from Bacteriophage T4 is functionally equivalent to GroES. GroES folds as a partly opened beta-barrel.

    The N-terminal domain of alcohol dehydrogenase-like proteins has a GroES-like fold, the C-terminal domain having a classical Rossmann fold. These proteins include alcohol dehydrogenase, which contains a zinc-finger subdomain within the GroES-like domain, ketose reductase (sorbitol dehydrogenase), formaldehyde dehydrogenase, quinone oxidoreductase and 2,4-dienoyl-CoA reductase.

    Proteins where this domain is known:
    PF13_0180    PFL0740c   


    SSF50156 - PDZ (Superfamily link)

    Interpro entry IPR001478 : PDZ/DHR/GLGF (Interpro link)

    Interpro description:

    PDZ domains are found in diverse signalling proteins in bacteria, yeasts, plants, insects and vertebrates. PDZ domains can occur in one or multiple copies and are nearly always found in cytoplasmic proteins. They bind either the carboxyl-terminal sequences of proteins or internal peptide sequences. In most cases, interaction between a PDZ domain and its target is constitutive, with a binding affinity of 1 to 10 µM. However, agonist-dependent activation of cell surface receptors is sometimes required to promote interaction with a PDZ protein. PDZ domain proteins are frequently associated with the plasma membrane, a compartment where high concentrations of phosphatidylinositol 4,5-bisphosphate (PIP2) are found. Direct interaction between PIP2 and a subset of class II PDZ domains (syntenin, CASK, Tiam-1) has been demonstrated.

    PDZ domains consist of 80 to 90 amino acids comprising six beta-strands (beta-A to beta-F) and two alpha-helices, A and B, compactly arranged in a globular structure. The peptide ligand binds in an elongated surface groove as an anti-parallel beta-strand that interacts with the beta-B strand and the B helix. The structure of PDZ domains allows binding to a free carboxylate group at the end of a peptide through a carboxylate-binding loop between the beta-A and beta-B strands.

    Proteins where this domain is known:
    MAL8P1.98    PFC0330w    PFC0785c   


    SSF50182 - Sm_like_riboprot (Superfamily link)

    Interpro entry IPR010920 : (Interpro link)

    Interpro description:

    This domain is found as the core structure in Lsm (like-Sm) proteins and bacterial Lsm-related Hfq proteins, and as the middle domain of the mechanosensitive channel protein MscS. In each case, the domain adopts a core structure consisting of an open beta-barrel with an SH3-like topology.

    Lsm proteins have diverse functions, and are thought to be important modulators of RNA biogenesis and function. The Sm proteins form part of specific small nuclear ribonucleoproteins (snRNPs) that are involved in the processing of pre-mRNAs to mature mRNAs, and are a major component of the eukaryotic spliceosome. These snRNPs consist of seven Sm proteins (B/B', D1, D2, D3, E, F and G), plus a small nuclear RNA (snRNA) (either U1, U2, U5 or U4/6). Other snRNPs, such as U7 snRNP, can contain different Lsm proteins. Lsm proteins are also found in archaebacteria, which do not have any splicing apparatus, suggesting a more general role for Lsm proteins.

    The pleiotropic translational regulator Hfq (host factor Q) is a bacterial Lsm-like protein, which modulates the structure of numerous RNA molecules by binding preferentially to A/U-rich sequences in RNA. Hfq adopts an Lsm-like fold; however, unlike the heptameric Sm proteins, Hfq forms a homo-hexameric ring.

    The middle domain of the mechanosensitive channel of small conductance protein (MscS or YggB) structurally resembles an Lsm protein. MscS is a mechanosensitive channel present in the membrane of bacteria, archaea and eukarya that responds both to stretching of the cell membrane and to membrane depolarisation. MscS folds as a homo-heptamer with a cylindrical shape, and can be divided into transmembrane and extramembrane regions: an N-terminal periplasmic region, a transmembrane region, and a C-terminal cytoplasmic region. The C-terminal cytoplasmic region can be further divided into middle and C-terminal domains, which together create a framework that connects to the cytoplasm through distinct openings. The middle domain exhibits an Lsm-like structure, consisting of five beta-strands that pack together with those of other subunits to form a barrel-like sheet extending around the entire protein.

    Proteins where this domain is known:
    MAL13P1.253    MAL8P1.48    MAL8P1.9    PF08_0049    PF11_0092    PF11_0255    PF11_0266    PF11_0280    PF11_0524    PF13_0142    PF14_0146    PF14_0411    PFB0865w    PFE1020w    PFI0475w    PFL0460w   


    SSF50193 - Ribosomal_L14 (Superfamily link)

    Interpro entry IPR000218 : Ribosomal protein L14b/L23e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L14 is one of the proteins from the large ribosomal subunit. In eubacteria, L14 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins, which have been grouped on the basis of sequence similarities. Based on amino-acid sequence homology, it is predicted that ribosomal protein L14 is a member of a recently identified family of structurally related RNA-binding proteins. L14 is a protein of 119 to 137 amino-acid residues.

    Proteins where this domain is known:
    PF13_0171    PFE0960w   


    SSF50199 - Staphylococal_nuclease_OB-fold (Superfamily link)

    Interpro entry IPR016071 : Staphylococcal nuclease (SNase-like), OB-fold (Interpro link)

    Interpro description:

    Staphylococcus aureus nuclease (SNase) homologues, previously thought to be restricted to bacteria and archaea, are also found in eukaryotes. Staphylococcal nuclease has a multi-domain organization. The human cellular coactivator p100 contains four repeats, each of which is a SNase homologue. These repeats are unlikely to possess SNase-like activities as each lacks equivalent SNase catalytic residues, yet they may mediate p100's single-stranded DNA-binding function. A variety of proteins, including many that are still uncharacterised, belong to this group.

    SNase domains have an OB-fold consisting of a closed or partly open beta-barrel with Greek key topology.

    Proteins where this domain is known:
    PF11_0374   


    SSF50249 - Nucleic_acid_OB (Superfamily link)

    Interpro entry IPR016027 : (Interpro link)

    Interpro description:

    A five-stranded beta-barrel was first noted as a common structure among four proteins binding single-stranded nucleic acids (staphylococcal nuclease and aspartyl-tRNA synthetase) or oligosaccharides (B subunits of enterotoxin and verotoxin-1), and has been termed the oligonucleotide/oligosaccharide binding motif, or OB fold. The fold is a five-stranded beta-sheet coiled to form a closed beta-barrel, capped by an alpha helix located between the third and fourth strands. Two ribosomal proteins, S17 and S1, are members of this class, and have different variations of the OB fold theme. Comparisons with other OB fold nucleic acid binding proteins suggest somewhat different mechanisms of nucleic acid recognition in each case.

    There are many nucleic acid-binding proteins that contain domains with this OB-fold structure, including anticodon-binding tRNA synthetases, ssDNA-binding proteins (CDC13, telomere-end binding proteins), phage ssDNA-binding proteins (gp32, gp2.5, gpV), cold shock proteins, DNA ligases, RNA-capping enzymes, DNA replication initiators and RNA polymerase subunit RPB8.

    Proteins where this domain is known:
    MAL13P1.327    MAL8P1.101    MAL8P1.18    PF07_0023    PF07_0039    PF07_0117    PF10_0294    PF11_0130    PF11_0332    PF11_0337    PF11_0447    PF13_0095    PF13_0262    PF13_0291    PF14_0144    PF14_0166    PF14_0177    PF14_0307    PF14_0401    PF14_0585    PF14_0658    PFA0145c    PFA0470c    PFB0125c    PFB0525w    PFC0290w    PFC0775w    PFD0470c    PFD0515w    PFD0600c    PFD0790c    PFE0435c    PFE0475w    PFE0715w    PFE0830c    PFE0845c    PFE1345c    PFI0235w    PFI0655c    PFL0210c    PFL0560c    PFL0580w    PFL0665c   


    SSF50324 - Pyrophosphatase (Superfamily link)

    Interpro entry IPR008162 : Inorganic pyrophosphatase (Interpro link)

    Interpro description:

    Inorganic pyrophosphatase (PPase) is the enzyme responsible for the hydrolysis of pyrophosphate (PPi), which is formed principally as the product of the many biosynthetic reactions that utilise ATP. All known PPases require the presence of divalent metal cations, with magnesium conferring the highest activity. Among other residues, a lysine has been postulated to be part of or close to the active site. PPases have been sequenced from bacteria such as Escherichia coli (homohexamer), Bacillus PS3 (Thermophilic bacterium PS-3) and Thermus thermophilus, from the archaebacterium Thermoplasma acidophilum, from fungi (homodimer), from a plant, and from bovine retina. In yeast, a mitochondrial isoform of PPase has been characterised which seems to be involved in energy production and whose activity is stimulated by uncouplers of ATP synthesis.

    The sequences of PPases share some regions of similarity, among which is a region that contains three conserved aspartates that are involved in the binding of cations.

    Proteins where this domain is known:
    PFC0710w   


    SSF50370 - RicinB_like (Superfamily link)

    Interpro entry IPR008997 : (Interpro link)

    Interpro description:

    The plant cytotoxin ricin is a heterodimer. The A chain, known to be a specific N-glycosidase, has a prominent active site cleft. The B chain is a two-domain lectin, which arose from the replication of a primitive sugar-binding peptide. The B chain subunit of ricin (RTB) binds to mammalian cell membranes by recognising galactose-containing receptors. RTB has two domains each with three subdomains; tripeptide kinks in the loops from subdomains 1alpha, 1beta, 2alpha, and 2gamma may interact with galactosides. Each of these subdomains has aromatic residues that can interact with the nonpolar face of galactose, and three of the four subdomain folds (1alpha, 1beta, and 2gamma) have polar residues for hydrogen bond formation to the sugar hydroxyls.

    The family 10 xylanase from Streptomyces olivaceoviridis (Streptomyces corchorusii) E-86 contains a (beta/alpha)(8)-barrel as a catalytic domain, a family 13 carbohydrate binding module as a xylan binding domain (XBD) and a Gly/Pro-rich linker between them. The crystal structure of this enzyme showed that XBD has three similar subdomains, as indicated by the presence of a triple-repeated sequence, forming a galactose binding lectin fold similar to that found in the ricin toxin B-chain.

    Proteins where this domain is known:
    PF14_0532    PF14_0723   


    SSF50447 - Translat_factor (Superfamily link)

    Interpro entry IPR009000 : (Interpro link)

    Interpro description:

    A beta barrel of circularly permuted topology is found in many translation proteins, including initiation and elongation factors, and also some ribosomal proteins, although in these cases the fold is elaborated with additional structures. The beta barrel domain is represented by domain 2 of the elongation factors EF-Tu and eEF1A, both of which function to recognize and transport aminoacyl-tRNA to the acceptor (A) site of the ribosome during the elongation process, and of EF-G, which functions in translocating the peptidyl tRNA from the A site to the peptidyl (P) site. This domain is also present in initiation factors, in domain 2 of eIF2 gamma subunit, and domains 2 and 4 of IF2/eIF5B, both of which function to transport the initiator methionyl-tRNA to the ribosome. This beta barrel domain may be involved in interactions with the switch 2 region to stabilise the relative orientations of the domains, which undergo functionally important conformational changes between GTP- and GDP-bound states.

    More information about translation elongation factors can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    MAL13P1.164    MAL13P1.243    PF07_0062    PF08_0018    PF10_0041    PF10_0272    PF11_0245    PF13_0069    PF13_0304    PF13_0305    PF14_0104    PF14_0486    PFA0495c    PFE0830c    PFF0115c    PFF0345w    PFI0570w    PFI0890c    PFL1590c    PFL1710c    PFL2180w   


    SSF50465 - Elong_init_C (Superfamily link)

    Interpro entry IPR009001 : (Interpro link)

    Interpro description:

    A beta barrel of circularly permuted topology is found in the C-terminus of many translation elongation and initiation factors. This domain is found in the elongation factors EF1A (or EF-Tu) of both eukaryotes and prokaryotes, which function to recognize and transport aminoacyl-tRNA to the acceptor (A) site of the ribosome during the elongation process. This domain is also found in the initiation factor eIF2 gamma subunit of eukaryotes, which functions to transport the initiator methionyl-tRNA to the ribosome. The C-terminal extension of mitochondrial EF1A (or EF-Tu) has structural similarities with DNA-recognising zinc fingers, suggesting that the extension may be involved in recognition of RNA.

    More information about EF1A proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    MAL13P1.164    PF11_0245    PF13_0304    PF13_0305    PF14_0104   
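
    The SSF50447 and SSF50465 listings above overlap, which is expected because both domains occur in the same elongation and initiation factors. A minimal Python sketch of the comparison follows; the helper name parse_ids and the variable names are illustrative and are not part of the original listing.

```python
def parse_ids(line: str) -> set[str]:
    """Split a whitespace-separated protein listing into a set of identifiers."""
    return set(line.split())

# Protein lists copied from the SSF50447 and SSF50465 entries.
translat_factor = parse_ids(
    "MAL13P1.164 MAL13P1.243 PF07_0062 PF08_0018 PF10_0041 PF10_0272 "
    "PF11_0245 PF13_0069 PF13_0304 PF13_0305 PF14_0104 PF14_0486 "
    "PFA0495c PFE0830c PFF0115c PFF0345w PFI0570w PFI0890c "
    "PFL1590c PFL1710c PFL2180w"
)
elong_init_c = parse_ids(
    "MAL13P1.164 PF11_0245 PF13_0304 PF13_0305 PF14_0104"
)

# Proteins carrying both domains, and those with only the SSF50447 domain.
shared = sorted(translat_factor & elong_init_c)
only_translat_factor = sorted(translat_factor - elong_init_c)

print("both domains:", shared)
print("SSF50447 only:", len(only_translat_factor), "proteins")
```

    In this listing every SSF50465 protein also carries the SSF50447 domain, so the intersection equals the shorter list.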


    SSF50486 - FMT_C_like (Superfamily link)

    Interpro entry IPR011034 : Formyl transferase, C-terminal-like (Interpro link)

    Interpro description:

    Methionyl-tRNA formyltransferase (FMT) transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EF-Tu-GTP. This family also includes formyltetrahydrofolate dehydrogenases, which produce formate from formyl-tetrahydrofolate. These enzymes contain an N-terminal domain in common with other formyl transferase enzymes. The C-terminal domain has an open beta-barrel fold.

    The C-terminal domain of FMT structurally resembles methylpurine-DNA glycosylases (MPG). Human 3-methyladenine DNA glycosylase (AAG) catalyses the first step of base excision repair by cleaving damaged bases from DNA, excising a chemically diverse selection of substrate bases damaged by alkylation or deamination.

    Proteins where this domain is known:
    PF14_0639   


    SSF50494 - Pept_Ser_Cys (Superfamily link)

    Interpro entry IPR009003 : Peptidase, trypsin-like serine and cysteine (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Cysteine peptidases have characteristic molecular topologies, which can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. These are peptidases in which the nucleophile is the sulphydryl group of a cysteine residue. Cysteine proteases are divided into clans (proteins which are evolutionarily related), and further sub-divided into families, on the basis of the architecture of their catalytic dyad or triad.

    This signature recognises a large group of serine and cysteine peptidases (including prokaryotic, eukaryotic and viral), which share a common closed beta barrel structure.

    Proteins where this domain is known:
    MAL8P1.126    MAL8P1.98   


    SSF50615 - ATPase_a/b_N (Superfamily link)

    Interpro entry IPR018118 : ATPase, F1/A1 complex, alpha/beta subunit, N-terminal (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    This entry represents the alpha and beta subunits found in the F1 and A1 complexes of F- and A-ATPases, respectively (sometimes called the A and B subunits in V- and A-ATPases), as well as the alpha subunit from certain V1-ATPases. The F-ATPases (or F1F0-ATPases), V-ATPases (or V1V0-ATPases) and A-ATPases (or A1A0-ATPases) are composed of two linked complexes: the F1, V1 or A1 complex contains the catalytic core that synthesizes/hydrolyses ATP, and the F0, V0 or A0 complex forms the membrane-spanning pore. The F-, V- and A-ATPases all contain rotary motors, one that drives proton translocation across the membrane and one that drives ATP synthesis/hydrolysis.

    In F-ATPases, there are three copies each of the alpha and beta subunits that form the catalytic core of the F1 complex, while the remaining F1 subunits (gamma, delta, epsilon) form part of the stalks. There is a substrate-binding site on each of the alpha and beta subunits, those on the beta subunits being catalytic, while those on the alpha subunits are regulatory. The alpha and beta subunits form a cylinder that is attached to the central stalk. The alpha/beta subunits undergo a sequence of conformational changes leading to the formation of ATP from ADP. These changes are induced by the rotation of the gamma subunit, which is itself driven by the movement of protons through the F0 complex C subunit.

    In V- and A-ATPases, the alpha/A and beta/B subunits of the V1 or A1 complex are homologous to the alpha and beta subunits in the F1 complex of F-ATPases, except that the alpha subunit is catalytic and the beta subunit is regulatory.

    The alpha/A and beta/B subunits can each be divided into three regions, or domains, centred around the ATP-binding pocket, and based on structure and function, where the central region is the nucleotide-binding domain. This entry represents the N-terminal domain of the alpha/A/beta/B subunits, which forms a closed beta-barrel with Greek-key topology.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0065    PFB0795w    PFL1725w   


    SSF50621 - Racem_decarbox_C (Superfamily link)

    Interpro entry IPR009006 : Alanine racemase/group IV decarboxylase, C-terminal (Interpro link)

    Interpro description:

    This entry represents a beta-barrel domain found at the C-terminal of alanine racemase and in group IV pyridoxal-5'-phosphate (PLP)-dependent decarboxylases, such as eukaryotic ornithine decarboxylase, arginine decarboxylase and diaminopimelate decarboxylase. These enzymes belong to the same structural family.

    Alanine racemase plays a role in providing the D-alanine required for cell wall biosynthesis by isomerising L-alanine to D-alanine. Proteins containing this domain are found in both prokaryotes and eukaryotes. The molecular structure of alanine racemase from Bacillus stearothermophilus (Geobacillus stearothermophilus) was determined by X-ray crystallography to a resolution of 1.9 Å. The alanine racemase monomer is composed of two domains, an eight-stranded alpha/beta barrel at the N-terminus, and a C-terminal domain essentially composed of beta-strands. The pyridoxal 5'-phosphate (PLP) cofactor lies in and above the mouth of the alpha/beta barrel and is covalently linked via an aldimine linkage to a lysine residue, which is at the C-terminus of the first beta-strand of the alpha/beta barrel.

    Eukaryotic ornithine decarboxylase (ODC) acts as a homodimer to produce putrescine (1,4-diaminobutane) from ornithine, where putrescine is the precursor of other polyamines in animals, plants, and bacteria. Arginine decarboxylase is also involved in putrescine biosynthesis. This is the first committed step in polyamine biosynthesis. Alanine racemase is a structurally homologous enzyme. Both proteins share a common alpha/beta barrel that binds the cofactor via a Schiff base on the C-terminal end of the barrel.

    Diaminopimelate decarboxylase (DapDC) catalyzes the final step of lysine biosynthesis in bacteria.

    Proteins where this domain is known:
    PF10_0322   


    SSF50630 - Pept_Aspartic (Superfamily link)

    Interpro entry IPR009007 : Peptidase aspartic, catalytic (Interpro link)

    Interpro description:

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    Aspartic endopeptidases of vertebrate, fungal and retroviral origin have been characterised. More recently, aspartic endopeptidases associated with the processing of bacterial type 4 prepilin and archaean preflagellin have been described.

    Structurally, aspartic endopeptidases are bilobal enzymes, each lobe contributing a catalytic Asp residue, with an extended active site cleft localised between the two lobes of the molecule. One lobe has probably evolved from the other through a gene duplication event in the distant past. In modern-day enzymes, although the three-dimensional structures are very similar, the amino acid sequences are more divergent, except for the catalytic site motif, which is highly conserved. The presence and position of disulphide bridges are other conserved features of aspartic peptidases. All or most aspartate peptidases are endopeptidases. These enzymes have been assigned into clans (proteins which are evolutionarily related), and further sub-divided into families, largely on the basis of their tertiary structure.

    These aspartate proteases all contain a common closed beta barrel structure, which includes pepsin, cathepsin, chymosin, beta-secretase, plasmepsin, plant acid proteases and retroviral proteases.

    Proteins where this domain is known:
    PF08_0108    PF10_0177    PF10_0329    PF13_0133    PF14_0075    PF14_0076    PF14_0077    PF14_0078    PF14_0090    PF14_0281    PF14_0625    PFC0495w    PFL1660c   


    SSF50677 - ValRS_IleRS_edit (Superfamily link)

    Interpro entry IPR009008 : Valyl/Leucyl/Isoleucyl-tRNA synthetase, class Ia, editing (Interpro link)

    Interpro description:

    Certain aminoacyl-tRNA synthetases prevent potential errors in protein synthesis through deacylation of mischarged tRNAs. The close homologs isoleucyl-tRNA synthetase (IleRS) and valyl-tRNA synthetase (ValRS) deacylate Val-tRNAIle and Thr-tRNAVal, respectively. These reactions strictly require the presence of the cognate tRNA. In the absence of tRNA, the enzymatically generated misactivated adenylates remain in the active site, sequestered from hydrolysis. Upon addition of cognate tRNA the misactivated amino acids are hydrolyzed, regenerating the free tRNA and amino acid, while converting 1 equivalent of ATP to AMP. A prominent mechanism for editing misactivated amino acids is the rapid hydrolysis of transiently mischarged tRNA. This reaction is catalyzed at a second active site on IleRS and ValRS. This site is located within a large insertion (termed CP1) into the canonical class I aminoacyl-tRNA synthetase active-site fold. The CP1 domain as an isolated polypeptide hydrolyzes its cognate mischarged tRNA.

    Proteins where this domain is known:
    PF13_0179    PF14_0589    PFF1095w    PFL1210w   


    SSF50692 - Asp_decarb_fold (Superfamily link)

    Interpro entry IPR009010 : Aspartate decarboxylase-like fold (Interpro link)

    Interpro description:

    Beta barrels are commonly observed in protein structures. They are classified in terms of two integral parameters: the number of strands in the sheet, n, and the shear number, S, a measure of the stagger of the strands in the beta-sheet. These two parameters have been shown to determine the major geometrical features of beta-barrels. Six-stranded beta-barrels with a pseudo-twofold axis are found in several proteins. One involving parallel strands forming two psi structures is known as the double-psi barrel. The first psi structure consists of the loop connecting strands beta1 and beta2 (a 'psi loop') and the strand beta5, whereas the second psi structure consists of the loop connecting strands beta4 and beta5 and the strand beta2. All the psi structures in double-psi barrels have a unique handedness, in that beta1 (beta4), beta2 (beta5) and the loop following beta5 (beta2) form a right-handed helix. The unique handedness may be related to the fact that the twisting angle between the parallel pair of strands is always larger than that between the antiparallel pair.

    In many cases, including aspartate decarboxylase and aspartic proteinases, strands 1 and 4 are each bent and consist of two sections. The two sections normally make a right angle; sometimes their hydrogen-bond patterns are disrupted at the corner by a bulge or even by a large insertion. In these cases, the barrel can also be viewed as a pair of orthogonally packed sheets, each with four strands.
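
    To make the role of the strand number n and shear number S concrete, the sketch below computes the strand tilt and barrel radius from an idealised barrel model often used in the structural literature: tan(alpha) = S*a / (n*b) and R = b / (2 * sin(pi/n) * cos(alpha)). The formulas, the spacing constants a = 3.3 Å and b = 4.4 Å, and the example value S = 10 are assumptions brought in for illustration; none of them comes from this entry.

```python
import math

# Assumed idealised spacings (in angstroms); not taken from this entry.
A_ALONG_STRAND = 3.3    # distance between adjacent residues along a strand
B_BETWEEN_STRANDS = 4.4  # distance between neighbouring strands

def barrel_geometry(n: int, shear: int) -> tuple[float, float]:
    """Return (tilt angle in degrees, radius in angstroms) for an ideal barrel.

    n     -- number of strands in the sheet
    shear -- shear number S, the stagger of the strands around the barrel
    """
    tilt = math.atan((shear * A_ALONG_STRAND) / (n * B_BETWEEN_STRANDS))
    radius = B_BETWEEN_STRANDS / (2 * math.sin(math.pi / n) * math.cos(tilt))
    return math.degrees(tilt), radius

# Six-stranded barrel; S = 10 is an illustrative input, not a value from the text.
tilt_deg, radius = barrel_geometry(n=6, shear=10)
print(f"tilt = {tilt_deg:.1f} degrees, radius = {radius:.2f} angstroms")
```

    Under this model, increasing S at fixed n tilts the strands further from the barrel axis and widens the barrel, which is one way to see why the two parameters fix the main geometrical features.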

    Proteins where this domain is known:
    PF07_0047    PFC0140c    PFF0940c   


    SSF50715 - Ribosomal_L25rel (Superfamily link)

    Interpro entry IPR011035 : Ribosomal protein L25/Gln-tRNA synthetase, anti-codon-binding (Interpro link)

    Interpro description:

    The bacterial ribosomal protein L25 is bound to 5S rRNA along with L5 and L18, forming a separate domain of the ribosome. The solution structure of protein L25 uncomplexed with RNA shows two significantly disordered loops and a closed beta-barrel domain with a complex topology that has significant structural similarities to the N-terminal domain of the Thermus thermophilus ribosomal protein TL5, to the general stress protein CTC, and to the C-terminal anticodon-binding domain of Escherichia coli glutaminyl-tRNA synthetase (GlnRS). GlnRS contains a duplication consisting of two L25-like beta-barrel domains with the swapping of N-terminal strands.

    Proteins where this domain is known:
    PF13_0170    PF13_0257    PFD0335c   


    SSF50729 - SSF50729 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.188    MAL13P1.256    MAL13P1.306    PF10_0132    PF10_0189    PF10_0314    PF11_0242    PF11_0252    PF11_0327    PFB0257c    PFD0207c    PFD0950w   


    SSF50784 - TFIIA_betabarrel (Superfamily link)

    Interpro entry IPR009088 : Transcription factor IIA, beta-barrel (Interpro link)

    Interpro description:

    Transcription factor IIA (TFIIA) is one of several factors that form part of a transcription pre-initiation complex along with RNA polymerase II, the TATA-box-binding protein (TBP) and TBP-associated factors, on the TATA-box sequence upstream of the initiation start site. After initiation, some components of the pre-initiation complex (including TFIIA) remain attached and re-initiate a subsequent round of transcription. TFIIA binds to TBP to stabilise TBP binding to the TATA element. TFIIA also inhibits the cytokine HMGB1 (high mobility group 1 protein) binding to TBP, and can dissociate HMGB1 already bound to TBP/TATA-box.

    Human and Drosophila TFIIA have three subunits: two large subunits, LN/alpha and LC/beta, derived from the same gene, and a small subunit, S/gamma. Yeast TFIIA has two subunits: a large TOA1 subunit that shows sequence similarity to the N-terminal of LN/alpha and the C-terminal of LC/beta, and a small subunit, TOA2 that is highly homologous with S/gamma. The conserved regions of the large and small subunits of TFIIA combine to form two domains: a four-helix bundle (helical domain) composed of two helices from each of the N-terminal regions of TOA1 and TOA2 in yeast; and a beta-barrel (beta-barrel domain) composed of beta-sheets from the C-terminal regions of TOA1 and TOA2.

    This entry represents the beta-barrel domain found at the C-terminal of both TOA1 (or alpha/beta) and TOA2 (or gamma) subunits of TFIIA, and their homologues.

    Proteins where this domain is known:
    MAL7P1.78   


    SSF50800 - PK_B_barrel_like (Superfamily link)

    Interpro entry IPR011037 : (Interpro link)

    Interpro description:

    Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:

     ADP + phosphoenolpyruvate = ATP + pyruvate 

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, except in certain bacteria, such as Escherichia coli, that have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (gluconeogenic precursor), therefore when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.

    The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    This entry represents the beta-barrel domain (note: it does not include the beta/alpha-barrel it is inserted into). This domain has a similar topology to the beta-strand-rich C-terminal domain of molybdenum cofactor (MOCO) sulphurase (MOSC domain). MOSC domains are found alone in bacterial YiiM proteins, or fused to other domains, such as a NifS-like catalytic domain in MOCO sulphurase. The MOSC domain is predicted to be a sulphur-carrier domain that receives sulphur abstracted from pyridoxal phosphate-dependent NifS-like enzymes, using it for the formation of diverse sulphur-metal clusters.

    Proteins where this domain is known:
    PF10_0363    PFF1300w   


    SSF50891 - CSA_PPIase (Superfamily link)

    Interpro entry IPR015891 : (Interpro link)

    Interpro description:

    This entry represents the core beta-barrel (8,10) domain found in cyclophilin (peptidylprolyl isomerase). This domain is related to a beta-barrel domain found in several outer membrane proteins, usually at the C-terminus; in these proteins, the beta-barrel (7,10) lacks the N-terminal strand of the cyclophilin domain, but remains closed.

    Cyclophilin is the major high-affinity binding protein in vertebrates for the immunosuppressive drug cyclosporin A (CSA), but is also found in other organisms. It exhibits a peptidyl-prolyl cis-trans isomerase activity (PPIase or rotamase). PPIase is an enzyme that accelerates protein folding by catalyzing the cis-trans isomerization of proline imidic peptide bonds in oligopeptides. It is probable that CSA mediates some of its effects by forming a tight complex with cyclophilin that inhibits the phosphatase activity of calcineurin. Cyclophilin A is a cytosolic and highly abundant protein. The protein belongs to a family of isozymes, including cyclophilins B and C, and natural killer cell cyclophilin-related protein. Major isoforms have been found throughout the cell, including the ER, and some are even secreted. The sequences of the different forms of cyclophilin-type PPIases are well conserved.

    Note: FKBPs, a family of proteins that bind the immunosuppressive drug FK506, are also PPIases, but their sequences are not at all related to that of cyclophilin.

    Proteins where this domain is known:
    PF08_0121    PF08_0128    PF11_0164    PF11_0170    PF14_0223    PFC0975c    PFE0505w    PFE1430c    PFI1490c    PFL0120c    PFL0735w   


    SSF50911 - Man6php_recept (Superfamily link)

    Interpro entry IPR009011 : Mannose-6-phosphate receptor, binding (Interpro link)

    Interpro description:

    Mannose-6-phosphate receptors (MPRs) are transmembrane proteins involved in the transport of lysosomal enzymes from the Golgi complex and the cell surface to lysosomes. Lysosomal enzymes bearing phosphomannosyl residues bind specifically to MPRs in the Golgi apparatus and the resulting receptor-ligand complex is transported to an acidic prelysosomal compartment, where the low pH mediates dissociation of the complex. There are two distinct MPRs that function in the recognition of mannose-6-phosphate-containing proteins: the cation-dependent MPR (CD-MPR) and the cation-independent MPR (CI-MPR). The CI-MPR is also known as the insulin-like growth factor II receptor, a multi-functional protein implicated in tumour suppression.

    The crystal structure of the N-terminal, extracytoplasmic, receptor-binding domain of bovine CD-MPR (excluding the signal sequence) reveals structural similarity to the fifteen homologous, repeating domains comprising the extracellular region of human CI-MPR. The structure consists of a partly opened, nine-stranded, beta-barrel.

    Proteins where this domain is known:
    PF07_0035   


    SSF50965 - Gal_oxid_central (Superfamily link)

    Interpro entry IPR011043 : (Interpro link)

    Interpro description:

    This entry represents a beta-propeller domain found in galactose oxidase and in Kelch repeat-containing proteins.

    The known functions of kelch-containing proteins are diverse: scruin is an actin cross-linking protein; galactose oxidase catalyses the oxidation of the hydroxyl group at the C6 position in D-galactose; neuraminidase hydrolyses sialic acid residues from glycoproteins; and kelch may have a cytoskeletal function, as it is localised to the actin-rich ring canals that connect the 15 nurse cells to the developing oocyte in Drosophila. Nevertheless, based on the location of the kelch pattern in the catalytic unit in galactose oxidase, functionally important residues have been predicted in glyoxal oxidase.

    Galactose oxidase is a monomeric enzyme that contains a single copper ion and catalyses the stereospecific oxidation of primary alcohols to their corresponding aldehydes. The protein contains an unusual covalent thioether bond between a tyrosine and a cysteine that forms during its maturation. Galactose oxidase is a three-domain protein: the N-terminal domain forms a jelly-roll sandwich, the central domain forms a seven-bladed beta-propeller, and the C-terminal domain has an immunoglobulin-like fold.

    Proteins where this domain is known:
    MAL7P1.137    PF10_0179    PF10_0219    PF11_0240    PF11_0267    PF11_0268    PF13_0238    PF14_0630    PF14_0649    PFL0270c    PFL0530c    PFL0650c   


    SSF50969 - Amine_DH_B_like (Superfamily link)

    Interpro entry IPR011044 : (Interpro link)

    Interpro description:

    Quinohemoprotein amine dehydrogenase (QHNDH) from Paracoccus denitrificans is a heterotrimer consisting of alpha, beta and gamma chains. The alpha chain has a four-domain structure that includes a dihaem cytochrome c, the beta chain forms a 7-bladed beta-propeller that is part of the enzyme active site, and the gamma chain contains the redox factor cysteine tryptophylquinone (CTQ).

    The beta chain of QHNDH structurally resembles the 7-bladed beta propeller of the H chain of the periplasmic quinoprotein methylamine dehydrogenase (MADH), found in methylotrophic bacteria. MADH is a heterotetramer consisting of two heavy (H) chains and two light (L) chains, and contains the redox cofactor tryptophan tryptophylquinone (TTQ). There is no similarity between the quinone-containing chains of MADH and QHNDH.

    The beta-propeller structure found in MADH and QHNDH is similar to the YVTN (Tyr-Val-Thr-Asn) repeat that folds into a beta-propeller found in the N-terminal domain of archaeal surface layer proteins, which help protect cells from extreme environments.

    Proteins where this domain is known:
    PF08_0130   


    SSF50978 - WD40_like (Superfamily link)

    Interpro entry IPR011046 : (Interpro link)

    Interpro description:

    WD-40 repeats (also known as WD or beta-transducin repeats) are short ~40 amino acid motifs, often terminating in a Trp-Asp (W-D) dipeptide. WD40 repeats usually assume a 7-8 bladed beta-propeller fold, but proteins have been found with 4 to 16 repeated units, which also form a circularised beta-propeller structure. WD-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control and apoptosis. Repeated WD40 motifs act as a site for protein-protein interaction, and proteins containing WD40 repeats are known to serve as platforms for the assembly of protein complexes or mediators of transient interplay among other proteins. The specificity of the proteins is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (beta subunit is a beta-propeller), TAFII transcription factor, and E3 ubiquitin ligase. In Arabidopsis spp., several WD40-containing proteins act as key regulators of plant-specific developmental events.

    The structures of several WD40 repeat-containing proteins have been determined, including the beta-1 subunit of the signal-transducing G protein heterotrimer, the C-terminal domain of yeast Tup1, the C-terminal domain of Groucho/tle1, the Cdc4 propeller domain, the bovine Arp2/3 complex 41 kDa subunit ARPC1, and actin interacting protein 1.

    Proteins where this domain is known:
    MAL13P1.142    MAL13P1.148    MAL13P1.245    MAL13P1.264    MAL13P1.385    MAL13P1.54    MAL13P1.79    MAL7P1.81    MAL8P1.145    MAL8P1.43    PF07_0017    PF07_0092    PF07_0106    PF08_0019    PF08_0065    PF08_0130    PF08_0135    PF10_0044    PF10_0045    PF10_0126    PF10_0128    PF10_0183    PF10_0196    PF10_0261    PF10_0285    PF10_0326    PF11_0056    PF11_0171    PF11_0195    PF11_0222    PF11_0252    PF11_0275    PF11_0400    PF11_0471    PF13_0149    PF13_0184    PF13_0250    PF13_0309    PF13_0335    PF14_0055    PF14_0062    PF14_0087    PF14_0101    PF14_0243    PF14_0262    PF14_0263    PF14_0314    PF14_0412    PF14_0456    PF14_0565    PF14_0640    PF14_0648    PFA0520c    PFB0640c    PFB0700c    PFB0755w    PFC0100c    PFC0365w    PFC0965w    PFD0455w    PFE0090w    PFE0505w    PFE0540w    PFE0860c    PFE0930w    PFE1200w    PFE1270c    PFE1310c    PFF0325c    PFF0330w    PFF0395c    PFF1000w    PFF1480w    PFI0275w    PFI0290c    PFI1080w    PFL0470w    PFL0610w    PFL0920c    PFL0970w    PFL1040w    PFL1290w    PFL1395c    PFL1470c    PFL1480w    PFL1680w    PFL1820w    PFL1975c    PFL2105c    PFL2460w   


    SSF50985 - RCC1/BLIP-II (Superfamily link)

    Interpro entry IPR009091 : (Interpro link)

    Interpro description:

    The beta-lactamase-inhibitor protein II (BLIP-II) is a secreted protein produced by the soil bacterium Streptomyces exfoliatus SMF19. BLIP-II acts as a potent inhibitor of beta-lactamases such as TEM-1, which is the most widespread resistance enzyme to penicillin antibiotics. BLIP-II binds competitively to TEM-1, but no direct contacts are made with TEM-1 active site residues. BLIP-II shows no sequence similarity with BLIP, even though both bind to and inhibit TEM-1. However, BLIP-II does share significant sequence identity with the regulator of chromosome condensation (RCC1) family of proteins. These two families are clearly related, both having a seven-bladed beta-propeller structure, although they differ in the number of strands per blade, BLIP-II having three antiparallel beta-strands per blade, while RCC1 has four-stranded blades. RCC1 is a eukaryotic nuclear protein that acts as a guanine nucleotide exchange factor for Ran, a member of the Ras GTPase family. RCC1 mediates a Ran-GTP gradient necessary for the regulation of spindle formation and nuclear assembly during mitosis, as well as for the transport of macromolecules across the nuclear membrane during interphase.

    Proteins where this domain is known:
    MAL7P1.38    PF11_0385    PF11_0448    PF13_0303    PFD0145c    PFD0900w    PFE0420c    PFI0975c    PFI1500w    PFL0975w   


    SSF50989 - Clathrin_H-chain_propeller_N (Superfamily link)

    Interpro entry IPR001473 : Clathrin, heavy chain, propeller, N-terminal (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. These vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transport. Clathrin coats contain both clathrin (acts as a scaffold) and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors.

    Clathrin is a trimer composed of three heavy chains and three light chains, each monomer projecting outwards like a leg; this three-legged structure is known as a triskelion. The heavy chains form the legs, their N-terminal beta-propeller regions extending outwards, while their C-terminal alpha-alpha-superhelical regions form the central hub of the triskelion. Peptide motifs can bind between the beta-propeller blades. The light chains appear to have a regulatory role, and may help orient the assembly and disassembly of clathrin coats as they interact with hsc70 uncoating ATPase. Clathrin triskelia self-polymerise into a curved lattice by twisting individual legs together. The clathrin lattice forms around a vesicle as it buds from the TGN, plasma membrane or endosomes, acting to stabilise the vesicle and facilitate the budding process. The multiple blades created when the triskelia polymerise are involved in multiple protein interactions, enabling the recruitment of different cargo adaptors and membrane attachment proteins.

    This entry represents the N-terminal beta-propeller region of clathrin heavy chains, which extends away from the hub of triskelia and is responsible for peptide binding.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PFL0930w   


    SSF51004 - Cyt_cd1_haem_C (Superfamily link)

    Interpro entry IPR011048 : (Interpro link)

    Interpro description:

    Cytochrome cd1 (cyt cd1) nitrite reductase is a dimeric enzyme of the bacterial periplasm that plays a key role in denitrification, the respiratory reduction of nitrite to nitric oxide in the nitrogen cycle. Each subunit of the cyt cd1 dimer contains one cytochrome c and one d1 haem group. The active site contains a specialised d1 haem, where the nitrite substrate is bound and reduced. This d1 haem is bound in an 8-bladed beta-propeller, which is also found in some members of the WD40 repeat-containing proteins.

    Proteins where this domain is known:
    PFI0290c    PFL1175w   


    SSF51045 - WW_Rsp5_WWP (Superfamily link)

    Interpro entry IPR001202 : WW/Rsp5/WWP (Interpro link)

    Interpro description:

    Synonym(s): Rsp5 or WWP domain

    The WW domain is a short conserved region in a number of unrelated proteins, which folds as a stable, triple-stranded beta-sheet. This short domain of approximately 40 amino acids may be repeated up to four times in some proteins. The name WW or WWP derives from the presence of two signature tryptophan residues that are spaced 20-23 amino acids apart and are present in most WW domains known to date, as well as that of a conserved Pro. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and/or phosphoserine- or phosphothreonine-containing motifs. It is frequently associated with other domains typical for proteins in signal transduction processes.
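
    The sequence features described above can be turned into simple pattern searches. The following is a minimal sketch, not part of the Interpro entry: the function names are hypothetical, and the reading of "20-23 amino acids apart" as 20-23 residues lying between the two tryptophans is an assumption.

```python
import re

# Ligand motif quoted in the text, [AP]-P-P-[AP]-Y, written as a regex.
LIGAND_MOTIF = re.compile(r"[AP]PP[AP]Y")

# ASSUMPTION: "spaced 20-23 amino acids apart" is read here as 20-23
# residues lying between the two signature tryptophans.
TRP_PAIR = re.compile(r"W(?=.{20,23}W)")


def find_ligand_motifs(sequence):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group())
            for m in LIGAND_MOTIF.finditer(sequence.upper())]


def has_trp_pair(sequence):
    """True if two tryptophans with the assumed spacing are present."""
    return TRP_PAIR.search(sequence.upper()) is not None
```

    A hit from either pattern is only a hint; it does not show that a sequence folds as a WW domain or binds one.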

    A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; Mus musculus (Mouse) NEDD-4, involved in the embryonic development and differentiation of the central nervous system; Saccharomyces cerevisiae (Baker's yeast) RSP5, similar to NEDD-4 in its molecular organization; Rattus norvegicus (Rat) FE65, a transcription-factor activator expressed preferentially in liver; Nicotiana tabacum (Common tobacco) DB10 protein; and others.

    Proteins where this domain is known:
    PF13_0091    PFL1745c   


    SSF51064 - GrpE_head (Superfamily link)

    Interpro entry IPR009012 : GrpE nucleotide exchange factor, head (Interpro link)

    Interpro description:

    In prokaryotes, the nucleotide exchange factor GrpE and the chaperone DnaJ are required for nucleotide binding of the molecular chaperone DnaK. The DnaK reaction cycle involves rapid peptide binding and release, which is dependent upon nucleotide binding. DnaJ accelerates the hydrolysis of ATP by DnaK, which enables the ADP-bound DnaK to tightly bind peptide. GrpE catalyses the release of ADP from DnaK, which is required for peptide release. In eukaryotes, GrpE is essential for mitochondrial Hsp70 function; however, the cytosolic Hsp70 homologues are GrpE-independent.

    GrpE binds as a homodimer to the ATPase domain of DnaK, and may interact with the peptide-binding domain of DnaK. GrpE accomplishes nucleotide exchange by opening the nucleotide-binding cleft of DnaK. GrpE is comprised of two domains, the N-terminal coiled coil domain, which may facilitate peptide release, and the C-terminal head domain, which forms part of the contact surface with the ATPase domain of DnaK. The head domain is comprised of six short beta strands with a limited hydrophobic core.

    Proteins where this domain is known:
    PF11_0258   


    SSF51069 - Euk_COanhd (Superfamily link)

    Interpro entry IPR001148 : (Interpro link)

    Interpro description:

    Carbonic anhydrases (CAs) are zinc metalloenzymes which catalyse the reversible hydration of carbon dioxide to bicarbonate. CAs have essential roles in facilitating the transport of carbon dioxide and protons in the intracellular space, across biological membranes and in the layers of the extracellular space; they are also involved in many other processes, from respiration and photosynthesis in eukaryotes to cyanate degradation in prokaryotes. There are five known evolutionarily distinct CA families (alpha, beta, gamma, delta and epsilon) that have no significant sequence identity and have structurally distinct overall folds. Some CAs are membrane-bound, while others act in the cytosol; there are several related proteins that lack enzymatic activity. The active site of alpha-CAs is well described, consisting of a zinc ion coordinated through 3 histidine residues and a water molecule/hydroxide ion that acts as a potent nucleophile. The enzyme employs a two-step mechanism: in the first step, there is a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide; in the second step, the active site is regenerated by the ionisation of the zinc-bound water molecule and the removal of a proton from the active site. Beta- and gamma-CAs also employ a zinc hydroxide mechanism, although at least some beta-class enzymes do not have water directly coordinated to the metal ion.
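
    The two-step mechanism described above for alpha-CAs can be written schematically as follows. This is a sketch rather than part of the Interpro entry: "E" is an assumed shorthand for the enzyme, and the displacement of bicarbonate by a water molecule is folded into the first line as the usual textbook simplification.

```latex
\begin{align*}
\mathrm{E{\cdot}Zn^{2+}{-}OH^-} + \mathrm{CO_2} + \mathrm{H_2O}
  &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}OH_2} + \mathrm{HCO_3^-} \\
\mathrm{E{\cdot}Zn^{2+}{-}OH_2}
  &\rightleftharpoons \mathrm{E{\cdot}Zn^{2+}{-}OH^-} + \mathrm{H^+}
\end{align*}
```

    Adding the two lines gives the overall reversible hydration, CO2 + H2O to bicarbonate plus a proton, with the zinc-bound hydroxide regenerated at the end.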

    This entry represents alpha class carbonic anhydrases.

    More information about these proteins can be found at Protein of the Month: Carbonic Anhydrase.

    Proteins where this domain is known:
    PF11_0410    PF11_0411   


    SSF51120 - Serralysn_like_C (Superfamily link)

    Interpro entry IPR011049 : (Interpro link)

    Interpro description:

    Serralysin is a bacterial Zn-endopeptidase that acts as a virulence factor to cause tissue damage and anaphylactic response. Many Zn-endopeptidases contain the metal binding motif HexxHxxGxxH; in addition to these coordinated histidine residues, serralysin contains a coordinated tyrosine residue that is unique to the astacin-like Zn enzymes. The Zn-endopeptidases containing the histidine motif are structurally similar to one another, containing an N-terminal catalytic domain that belongs to the zincin family, and a C-terminal beta-helix metal-binding domain. These peptidases include the astacin family, snake venom Zn-endopeptidases, the extracellular metalloproteases from Serratia sp., Pseudomonas sp. and Erwinia sp., and the matrixins.

    Proteins where this domain is known:
    PFB0675w   


    SSF51161 - Trimer_LpxA_like (Superfamily link)

    Interpro entry IPR011004 : Trimeric LpxA-like (Interpro link)

    Interpro description:

    This domain is characterised by trimeric LpxA-like enzymes that display a single-stranded left-handed beta-helix fold, composed of tandem repeats of a hexapeptide, as represented by the Bacterial transferase hexapeptide repeat, where the hexapeptide repeats correspond to individual strands. Many bacterial transferases contain this domain. The structures of several proteins with this domain have been determined, including UDP N-acetylglucosamine acyltransferase (LpxA) from Escherichia coli, the first enzyme in the lipid A biosynthetic pathway; galactoside acetyltransferase (GAT, LacA) from E. coli, a gene product of the lac operon that may assist cellular detoxification; gamma-class Archaeon carbonic anhydrase, a zinc-containing enzyme that catalyses the reversible hydration of carbon dioxide; tetrahydrodipicolinate-N-succinyltransferase (DapD) from Mycobacterium bovis, an enzyme from the lysine biosynthetic pathway that contains an extra N-terminal 3-helical domain; and the C-terminal domain of N-acetylglucosamine 1-phosphate uridyltransferase (GlmU) from E. coli, a trimeric bifunctional enzyme that catalyses the last two sequential reactions in the de novo biosynthetic pathway for UDP-N-acetylglucosamine, an essential precursor for many biomolecules.

    Proteins where this domain is known:
    MAL8P1.68    PF07_0098    PF11_0460    PF14_0774    PFB0190c    PFC0860w    PFD0260c    PFE0240c    PFE1325w    PFL0675c    PFL1565c   


    SSF51182 - RmlC_like_cupin (Superfamily link)

    Interpro entry IPR011051 : (Interpro link)

    Interpro description:

    RmlC (dTDP (deoxythymidine diphosphate)-4-dehydrorhamnose 3,5-epimerase) is a dTDP-sugar isomerase enzyme involved in the synthesis of L-rhamnose, a saccharide required for the virulence of some pathogenic bacteria. RmlC is a dimer, each monomer being formed from two beta-sheets arranged in a beta-sandwich, where the substrate-binding site is located between the two sheets of both monomers.

    Other protein families contain domains that share this fold, including glucose-6-phosphate isomerase; germin, a metal-binding protein with oxalate oxidase and superoxide dismutase activities; auxin-binding protein; seed storage protein 7S; acireductone dioxygenase; as well as three proteins that have metal-binding sites similar to that of germin, namely quercetin 2,3-dioxygenase, phosphomannose isomerase and homogentisate dioxygenase, the last three sharing a 2-domain fold with storage protein 7S.

    Proteins where this domain is known:
    MAL8P1.156   


    SSF51197 - SSF51197 (Superfamily link)

    Proteins where this domain is known:
    PF11_0230    PFF0135w   


    SSF51206 - cNMP_binding (Superfamily link)

    Interpro entry IPR018490 : (Interpro link)

    Interpro description:

    Proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues. The best studied of these proteins is the prokaryotic catabolite gene activator (also known as the cAMP receptor protein) (gene crp) where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure. There are six invariant amino acids in this domain, three of which are glycine residues that are thought to be essential for maintenance of the structural integrity of the beta-barrel. cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain. The cAPKs are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain. The cGPKs are single chain enzymes that include the two copies of the domain in their N-terminal section. Vertebrate cyclic nucleotide-gated ion-channels also contain this domain. Two such cation channels have been fully characterised; one is found in rod cells where it plays a role in visual signal transduction.

    Proteins where this domain is known:
    PF14_0172    PF14_0173    PF14_0346    PFI1560c    PFL1110c   


    SSF51230 - Hybrid_motif (Superfamily link)

    Interpro entry IPR011053 : (Interpro link)

    Interpro description:

    The single hybrid motif has a beta-barrel sandwich hybrid fold, consisting of a sandwich of half-barrel shaped beta-sheets. This motif is found in biotinyl/lipoyl-carrier proteins and domains, where the biotin and lipoic acid moieties act as covalently attached coenzyme cofactors in enzymes that catalyse metabolic reactions. For example, this motif can be found in the biotinyl domain of Escherichia coli acetyl-CoA carboxylase, protein H of the glycine cleavage system in Pisum sativum (Garden pea), the lipoyl domain of dihydrolipoamide acetyltransferase, which is a component of the pyruvate dehydrogenase complex, the lipoyl domain of the 2-oxoglutarate dehydrogenase complex, and the lipoyl domain of the mitochondrial branched-chain alpha-ketoacid dehydrogenase.

    Proteins where this domain is known:
    PF10_0407    PF11_0339    PF13_0017    PF13_0121    PF14_0664    PFC0170c   


    SSF51246 - Rudmnt_hyb_motif (Superfamily link)

    Interpro entry IPR011054 : (Interpro link)

    Interpro description:

    The rudiment single hybrid motif has a beta-barrel sandwich hybrid motif, consisting of a sandwich of half-barrel shaped beta-sheets. This motif is found in the small domain of cytochrome f, as well as in the C-terminal domain of the biotin carboxylase subunit of acetyl-CoA carboxylase, and its family members, such as glycinamide ribonucleotide synthetase C-terminal domain, N5-carboxyaminoimidazole ribonucleotide synthetase PurK C-terminal domain, and glycinamide ribonucleotide transformylase PurT C-terminal domain.

    Proteins where this domain is known:
    PF14_0664   


    SSF51283 - SSF51283 (Superfamily link)

    Proteins where this domain is known:
    PF11_0282   


    SSF51306 - Pept_S24_S26_C (Superfamily link)

    Interpro entry IPR015927 : (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
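
    The relationship between linear triad order and clan can be illustrated with a short lookup. This is a minimal sketch with hypothetical function names; the three orders come from the paragraph above, while the residue numbers in the usage note (chymotrypsin His57/Asp102/Ser195, subtilisin Asp32/His64/Ser221) are the commonly cited numbering and are not taken from this entry.

```python
# Triad orders and clans as stated in the text.
CLAN_BY_ORDER = {"HDS": "PA", "DHS": "SB", "SDH": "SC"}


def triad_order(his, asp, ser):
    """Return the letters H, D, S sorted by their sequence positions."""
    positions = {"H": his, "D": asp, "S": ser}
    return "".join(sorted(positions, key=positions.get))


def clan_for_triad(his, asp, ser):
    """Return the clan suggested by triad order, or None if unlisted."""
    return CLAN_BY_ORDER.get(triad_order(his, asp, ser))
```

    For example, clan_for_triad(his=57, asp=102, ser=195) gives "PA" and clan_for_triad(his=64, asp=32, ser=221) gives "SB". Since the text says the order only "commonly" reflects clan relationships, the result is a heuristic, not a classification.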

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This signature is associated with serine peptidases belonging to MEROPS peptidase families: S24 (LexA family, clan SF); S26A (signal peptidase I), S26B (signalase) and S26C (TraF peptidase).

    The S26 family includes Escherichia coli signal peptidase, SPase, which is a membrane-bound endopeptidase, with two N-terminal transmembrane segments and a C-terminal catalytic region. SPase functions to release proteins that have been translocated into the inner membrane from the cell interior, by cleaving off their signal peptides.

    The S24 family includes:

    All of these proteins, with the possible exception of RulA, interact with RecA, which activates self-cleavage, either derepressing transcription in the case of CI and LexA or activating the lesion-bypass polymerase in the case of UmuD and MucA. UmuD'2 is the homodimeric component of DNA pol V, which is produced from UmuD by RecA-facilitated self-cleavage. The first 24 N-terminal residues of UmuD are removed; UmuD'2 is a DNA lesion bypass polymerase. MucA, like UmuD, is a plasmid-encoded DNA polymerase (pol RI) which is converted into the active lesion-bypass polymerase by a self-cleavage reaction involving RecA.

    This group of proteins also contains proteins not recognised as peptidases as well as those classified as non-peptidase homologues as they either have been found experimentally to be without peptidase activity, or lack amino acid residues that are believed to be essential for catalytic activity.

    Proteins where this domain is known:
    MAL13P1.167    PF13_0118   


    SSF51316 - Mss4_like (Superfamily link)

    Interpro entry IPR011057 : (Interpro link)

    Interpro description:

    This entry represents a structural domain with a complex fold consisting of several coiled beta-sheets. This domain exists as a duplication, consisting of a tandem repeat of two similar structural motifs. These domains can be found in:

    Mss4 is a conserved accessory factor for Rab GTPases, which function as ubiquitous regulators of intracellular membrane trafficking. Mss4 acts to promote nucleotide release from exocytic but not endocytic Rab GTPases. Mss4 has a complex fold made of several coiled beta-sheets, and consists of a duplication of tandem repeats of two similar structural motifs. It contains a zinc-binding site.

    Other proteins that show structural similarity to Mss4 include the translationally controlled tumour-associated proteins (TCTPs), which contain an insertion of an alpha helical hairpin and lack the zinc-binding site. TCTPs are a highly conserved and abundantly expressed family of eukaryotic proteins that are implicated in both cell growth and the human acute allergic response.

    The C-terminal MsrB domain of peptide methionine sulphoxide reductase PilB is structurally similar to Mss4. Methionine sulphoxide reductases protect against oxidative damage that can contribute to cell death. The tandem Msr domains (MsrA and MsrB) of the pilB protein from Neisseria gonorrhoeae each reduce different epimeric forms of methionine sulphoxide.

    Proteins where this domain is known:
    PFE0545c   


    SSF51344 - ATPsynt_DE (Superfamily link)

    Interpro entry IPR001469 : ATPase, F1 complex, delta/epsilon subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring-shaped C subunits form the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    This family represents subunits called delta (in mitochondrial ATPase) or epsilon (in bacteria or chloroplast ATPase). The interaction site of subunit C of the F0 complex with the delta or epsilon subunit of the F1 complex may be important for connecting the rotor of F1 (gamma subunit) to the rotor of F0 (C subunit). In bacterial species, the delta subunit is the equivalent of the Oligomycin sensitive subunit (OSCP) in metazoans. The C-terminal domain of the epsilon subunit appears to act as an inhibitor of ATPase activity.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF11_0485   


    SSF51351 - Triophos_ismrse (Superfamily link)

    Interpro entry IPR000652 : Triosephosphate isomerase (Interpro link)

    Interpro description:

    Triosephosphate isomerase (TIM) is the glycolytic enzyme that catalyses the reversible interconversion of glyceraldehyde 3-phosphate and dihydroxyacetone phosphate. TIM plays an important role in several metabolic pathways and is essential for efficient energy production. It is a dimer of identical subunits, each of which is made up of about 250 amino-acid residues. A glutamic acid residue is involved in the catalytic mechanism. The sequence around the active site residue is perfectly conserved in all known TIMs. Deficiencies in TIM are associated with haemolytic anaemia coupled with a progressive, severe neurological disorder.

    Proteins where this domain is known:
    PF14_0378    PFC0831w   


    SSF51366 - RibP_bind_barrel (Superfamily link)

    Interpro entry IPR011060 : Ribulose-phosphate binding barrel (Interpro link)

    Interpro description:

    The ribulose-phosphate binding barrel consists of a parallel beta-sheet barrel fold containing a phosphate-binding site. Several proteins display this fold, including histidine biosynthesis enzymes, tryptophan biosynthesis enzymes, D-ribulose-5-phosphate 3-epimerase, and decarboxylases.

    Proteins where this domain is known:
    MAL13P1.319    PF10_0225    PFF1025c    PFL0960w   


    SSF51391 - TMP_synthase (Superfamily link)

    Interpro entry IPR003733 : Thiamine monophosphate synthase (Interpro link)

    Interpro description:

    Thiamine monophosphate synthase (TMP) catalyzes the substitution of the pyrophosphate of 2-methyl-4-amino-5-hydroxymethylpyrimidine pyrophosphate by 4-methyl-5-(beta-hydroxyethyl)thiazole phosphate to yield thiamine phosphate in the thiamine biosynthesis pathway.

    TENI, a protein from Bacillus subtilis that regulates the production of several extracellular enzymes by reducing alkaline protease production, belongs to this group.

    Proteins where this domain is known:
    PFF0680c   


    SSF51395 - FMN-linked oxidoreductases (Superfamily link)

    Proteins where this domain is known:
    PF14_0086    PF14_0334    PFF0160c    PFI0920c   


    SSF51412 - SSF51412 (Superfamily link)

    Proteins where this domain is known:
    PFI1020c   


    SSF51419 - SSF51419 (Superfamily link)

    Proteins where this domain is known:
    PF10_0322    PFI0965w   


    SSF51430 - Aldo/ket_red (Superfamily link)

    Interpro entry IPR001395 : Aldo/keto reductase (Interpro link)

    Interpro description:

    The aldo-keto reductase family includes a number of related monomeric NADPH-dependent oxidoreductases, such as aldehyde reductase, aldose reductase, prostaglandin F synthase, xylose reductase, rho crystallin, and many others. All possess a similar structure, with a beta-alpha-beta fold characteristic of nucleotide binding proteins. The fold comprises a parallel beta-8/alpha-8-barrel, which contains a novel NADP-binding motif. The binding site is located in a large, deep, elliptical pocket in the C-terminal end of the beta sheet, the substrate being bound in an extended conformation. The hydrophobic nature of the pocket favours aromatic and apolar substrates over highly polar ones.

    Binding of the NADPH coenzyme causes a massive conformational change, reorienting a loop, effectively locking the coenzyme in place. This binding is more similar to FAD- than to NAD(P)-binding oxidoreductases.

    Some proteins of this entry contain a K+ ion channel beta chain regulatory domain; these are reported to have oxidoreductase activity.

    Proteins where this domain is known:
    MAL13P1.324    PF14_0088   


    SSF51445 - Glyco_hydro_cat (Superfamily link)

    Interpro entry IPR017853 : (Interpro link)

    Interpro description:

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'.

    This entry represents the catalytic TIM beta/alpha barrel common to many different families of glycosyl hydrolases. Structures have been determined for several proteins containing this domain, including family 13 glycosyl hydrolases (such as alpha-amylase), beta-glycanases, family 1 glycosyl hydrolases (such as beta-glucosidase), type II chitinases, 1,4-beta-N-acetylmuraminidases, and beta-N-acetylhexosaminidases.

    More information about this protein can be found at Protein of the Month: alpha-Amylase.

    Proteins where this domain is known:
    MAL13P1.258    PFL2510w   


    SSF51556 - SSF51556 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.146    PF10_0289    PF14_0697    PFA0580c   


    SSF51569 - Aldolase (Superfamily link)

    Proteins where this domain is known:
    PF10_0210    PF14_0381    PF14_0425   


    SSF51604 - SSF51604 (Superfamily link)

    Proteins where this domain is known:
    PF10_0155   


    SSF51621 - Pyrv/PenolPyrv_Kinase_cat (Superfamily link)

    Interpro entry IPR015813 : Pyruvate/Phosphoenolpyruvate kinase, catalytic core (Interpro link)

    Interpro description:

    Pyruvate kinase controls the exit from the glycolysis pathway, catalysing the transfer of phosphate from phosphoenolpyruvate (PEP) to ADP. Mammalian pyruvate kinase is a homotetramer, where each polypeptide subunit consists of four domains: N-terminal, A domain, B domain and C-terminal. Activation of the enzyme is believed to occur via the clamping down of the B domain onto the A domain to dehydrate the active site cleft. The N- and C-terminal domains are situated at inter-subunit contact sites, and could be involved in assembly and communication within the complex. The N-terminal domain has a TIM beta/alpha-barrel structure. Homologous TIM-barrel domains are found in the following proteins:

    Proteins where this domain is known:
    PF10_0363    PF14_0246    PFF1300w   


    SSF51658 - Xyl_isomerase-like_TIM-brl (Superfamily link)

    Interpro entry IPR013022 : (Interpro link)

    Interpro description:

    This entry represents a structural motif with a beta/alpha TIM barrel found in several protein families:

    These proteins share similar, but not identical, metal-binding sites. In addition, xylose isomerase and L-rhamnose isomerase each have additional alpha-helical domains involved in tetramer formation. This entry differs from IPR012307 in having a wider coverage of TIM-barrel protein families.

    Proteins where this domain is known:
    PF13_0176   


    SSF51695 - PLC-like_Pdiesterase_TIM-brl (Superfamily link)

    Interpro entry IPR017946 : PLC-like phosphodiesterase, TIM beta/alpha-barrel domain (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of a TIM beta/alpha-barrel. These domains are found in several phospholipase C (PLC) like phosphodiesterases, including:

    Phospholipase C (PLC) isozymes are directly activated by heterotrimeric G proteins and Ras-like GTPases to hydrolyze phosphatidylinositol 4,5-bisphosphate into the second messengers diacylglycerol and inositol 1,4,5-trisphosphate. PLC enzymes often play central roles in various signalling cascades.

    Proteins where this domain is known:
    PF10_0132    PF14_0060   


    SSF51713 - tRNA_ribo_trans (Superfamily link)

    Interpro entry IPR002616 : Queuine/other tRNA-ribosyltransferase (Interpro link)

    Interpro description:

    This is a family of queuine, archaeosine and general tRNA-ribosyltransferases, also known as tRNA-guanine transglycosylase and guanine insertion enzyme. Queuine tRNA-ribosyltransferase modifies tRNAs for asparagine, aspartic acid, histidine and tyrosine with queuine at position 34 and with archaeosine at position 15 in archaeal tRNAs. In bacteria it catalyses the exchange of guanine-34 at the wobble position with 7-aminomethyl-7-deazaguanine, and the addition of a cyclopentenediol moiety to 7-aminomethyl-7-deazaguanine-34 tRNA, giving a hypermodified base queuine in the wobble position. The aligned region contains a zinc binding motif C-x-C-x2-C-x29-H, and important tRNA and 7-aminomethyl-7-deazaguanine binding residues.
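
    The zinc binding motif quoted above can be scanned for with a regular expression. The following is a minimal sketch under stated assumptions: the function name is hypothetical, "x" is read as any single residue and "xN" as exactly N residues, and only single-letter residues and gaps are supported.

```python
import re


def motif_to_regex(motif):
    """Translate a dash-separated motif such as 'C-x-C-x2-C-x29-H'."""
    parts = []
    for token in motif.split("-"):
        gap = re.fullmatch(r"x(\d*)", token)
        if gap:
            # ASSUMPTION: bare 'x' means one residue, 'xN' means exactly N.
            count = gap.group(1)
            parts.append("." if count == "" else ".{%s}" % count)
        elif re.fullmatch(r"[A-Z]", token):
            parts.append(token)
        else:
            raise ValueError("unsupported token: %r" % token)
    return "".join(parts)


ZINC_MOTIF = re.compile(motif_to_regex("C-x-C-x2-C-x29-H"))
```

    ZINC_MOTIF.search(sequence) then returns a match object for the first occurrence in an upper-case protein sequence, or None. A match shows only that the spacing of Cys and His residues fits the pattern, not that zinc is bound.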

    Proteins where this domain is known:
    PF07_0071    PF14_0322    PFL2030w   


    SSF51717 - DHP_synth_like (Superfamily link)

    Interpro entry IPR011005 : Dihydropteroate synthase-like (Interpro link)

    Interpro description:

    All organisms require reduced folate cofactors for the synthesis of a variety of metabolites. The enzyme 7,8-dihydropteroate synthase (DHPS) catalyses the condensation of para-aminobenzoic acid (pABA) with 6-hydroxymethyl-7,8-dihydropterin-pyrophosphate to form 7,8-dihydropteroate and pyrophosphate. DHPS is essential for the de novo synthesis of folate in prokaryotes, lower eukaryotes, and in plants, but is absent in mammals. By contrast, higher vertebrates possess an active transport system that enables them to use dietary folates. DHPS is the target of sulphonamides, which are substrate analogues that compete with pABA, but which do not affect vertebrates as they lack the DHPS enzyme. DHPS is a single domain protein that forms an eight-stranded TIM alpha/beta barrel, where the 7,8-dihydropterin pyrophosphate substrate binds in a deep cleft in the barrel. In the lower eukaryote Pneumocystis carinii, DHPS is the C-terminal domain of a multifunctional folate synthesis enzyme (gene fas).

    Other proteins contain a DHPS-like domain, including members of the methyltetrahydrofolate:corrinoid/iron-sulphur protein methyltransferase (MeTr) family. MeTr catalyses a key step in the Wood-Ljungdahl pathway of carbon dioxide fixation. Other members of this family that contain a DHPS-like domain include methionine synthase and methanogenic enzymes that activate the methyl group of methyltetrahydromethano(or -sarcino)pterin.

    Proteins where this domain is known:
    PF08_0095    PF10_0221   


    SSF51726 - UROD/MetE-like (Superfamily link)

    Proteins where this domain is known:
    PFF0360w   


    SSF51735 - NAD(P)-bd (Superfamily link)

    Interpro entry IPR016040 : NAD(P)-binding (Interpro link)

    Interpro description:

    This entry represents NAD- and NADP-binding domains with a core Rossmann-type fold, which consists of three layers (alpha/beta/alpha), where the six beta strands are parallel in the order 321456. Many different enzymes contain an NAD/NADP-binding domain.

    Proteins where this domain is known:
    MAL13P1.284    PF08_0077    PF08_0132    PF10_0137    PF11_0097    PF11_0157    PF13_0141    PF13_0144    PF14_0164    PF14_0286    PF14_0357    PF14_0508    PF14_0511    PF14_0520    PF14_0598    PF14_0641    PFB0880w    PFC0260w    PFD0465c    PFD1035w    PFE0585c    PFE1050w    PFF0730c    PFF0895w    PFF1265w    PFF1490w    PFI1125c    PFL0780w   


    SSF51905 - SSF51905 (Superfamily link)

    Proteins where this domain is known:
    MAL8P1.154    PF07_0085    PF08_0066    PF08_0068    PF10_0275    PF10_0334    PF10_0373    PF14_0192    PF14_0334    PFC0275w    PFF0815w    PFI0505c    PFI0735c    PFI1170c    PFL0575w    PFL1550w    PFL2060c    PFL2115c   


    SSF51971 - SSF51971 (Superfamily link)

    Proteins where this domain is known:
    PF11_0407    PF14_0334   


    SSF51998 - SSF51998 (Superfamily link)

    Proteins where this domain is known:
    PF14_0352   


    SSF52016 - Aconitase/3IPM_dehydase_swvl (Superfamily link)

    Interpro entry IPR015928 : Aconitase/3-isopropylmalate dehydratase, swivel (Interpro link)

    Interpro description:

    3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuC does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also two forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential, function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA, have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3).

    This entry represents the 'swivel' domain found at the C-terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but in the N-terminal region following the HEAT-like domain in bacterial AcnB. This domain has a three-layer beta/beta/alpha structure, and in cytosolic Acn is known to rotate between the cAcn and IRP1 forms of the enzyme. This domain is also found in the small subunit of isopropylmalate dehydratase (LeuD).

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    SSF52021 - CP_synthsmall (Superfamily link)

    Interpro entry IPR002474 : Carbamoyl phosphate synthase, small subunit, N-terminal (Interpro link)

    Interpro description:

    Carbamoyl phosphate synthase (CPSase) is a heterodimeric enzyme composed of a small and a large subunit (with the exception of CPSase III, see below). CPSase catalyses the synthesis of carbamoyl phosphate from bicarbonate, ATP and glutamine or ammonia, and represents the first committed step in pyrimidine and arginine biosynthesis in prokaryotes and eukaryotes, and in the urea cycle in most terrestrial vertebrates. CPSase has three active sites, one in the small subunit and two in the large subunit. The small subunit contains the glutamine binding site and catalyses the hydrolysis of glutamine to glutamate and ammonia. The large subunit has two homologous carboxy phosphate domains, both of which have ATP-binding sites; however, the N-terminal carboxy phosphate domain catalyses the phosphorylation of bicarbonate, while the C-terminal domain catalyses the phosphorylation of the carbamate intermediate. The carboxy phosphate domain found duplicated in the large subunit of CPSase is also present as a single copy in the biotin-dependent enzymes acetyl-CoA carboxylase (ACC), propionyl-CoA carboxylase (PCCase), pyruvate carboxylase (PC) and urea carboxylase.

    Most prokaryotes carry one form of CPSase that participates in both arginine and pyrimidine biosynthesis; however, certain bacteria can have separate forms. The large subunit in bacterial CPSase has four structural domains: the carboxy phosphate domain 1, the oligomerisation domain, the carbamoyl phosphate domain 2 and the allosteric domain. CPSase heterodimers from Escherichia coli contain two molecular tunnels: an ammonia tunnel and a carbamate tunnel. These inter-domain tunnels connect the three distinct active sites, and function as conduits for the transport of unstable reaction intermediates (ammonia and carbamate) between successive active sites. The catalytic mechanism of CPSase involves the diffusion of carbamate through the interior of the enzyme from the site of synthesis within the N-terminal domain of the large subunit to the site of phosphorylation within the C-terminal domain.

    Eukaryotes have two distinct forms of CPSase: a mitochondrial enzyme (CPSase I) that participates in both arginine biosynthesis and the urea cycle; and a cytosolic enzyme (CPSase II) involved in pyrimidine biosynthesis. CPSase II occurs as part of a multi-enzyme complex along with aspartate transcarbamoylase and dihydroorotase; this complex is referred to as the CAD protein. The hepatic expression of CPSase is transcriptionally regulated by glucocorticoids and/or cAMP. There is a third form of the enzyme, CPSase III, found in fish, which uses glutamine as a nitrogen source instead of ammonia. CPSase III is closely related to CPSase I, and is composed of a single polypeptide that may have arisen from gene fusion of the glutaminase and synthetase domains.

    This entry represents the N-terminal domain of the small subunit of carbamoyl phosphate synthase. The small subunit catalyses the hydrolysis of glutamine to ammonia, which is in turn used by the large chain to synthesize carbamoyl phosphate. The small subunit has a 3-layer beta/beta/alpha structure, and is thought to be mobile in most proteins that carry it. The C-terminal domain of the small subunit of CPSase has glutamine amidotransferase activity.

    Proteins where this domain is known:
    PF13_0044   


    SSF52029 - SSF52029 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.283    PF10_0153    PF11_0331    PF14_0123    PFB0635w    PFC0285c    PFC0350c    PFC0900w    PFF0430w    PFL1425w    PFL1545c   


    SSF52042 - Ribosomal_L32e (Superfamily link)

    Interpro entry IPR001515 : Ribosomal protein L32e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The L32e family consists of proteins that have 135 to 240 amino-acid residues.

    Proteins where this domain is known:
    PFI0190w   


    SSF52047 - RNI-like (Superfamily link)

    Proteins where this domain is known:
    MAL7P1.25    PF10_0420    PF11_0243    PF14_0021    PF14_0506    PFC0930c    PFL1805c    PFL2380c   


    SSF52058 - L domain-like (Superfamily link)

    Proteins where this domain is known:
    MAL8P1.46    PF10_0320    PF11_0476    PF14_0257    PF14_0305    PF14_0403    PF14_0496    PF14_0785    PFE0455w    PFF0595c   


    SSF52075 - Outer arm dynein light chain 1 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.238    PF14_0651    PFI0330c    PFI1470c    PFL1360c   


    SSF52080 - SSF52080 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.209    PF14_0270    PF14_0276    PFF0885w   


    SSF52087 - CRAL_TRIO_C (Superfamily link)

    Interpro entry IPR001251 : (Interpro link)

    Interpro description:
    This entry defines the C-terminal of various retinaldehyde/retinal-binding proteins that may be functional components of the visual cycle. Cellular retinaldehyde-binding protein (CRALBP) carries 11-cis-retinol or 11-cis-retinaldehyde as endogenous ligands and may function as a substrate carrier protein that modulates interaction of these retinoids with visual cycle enzymes. The multidomain protein Trio binds the LAR transmembrane tyrosine phosphatase, contains a protein kinase domain, and has separate rac-specific and rho-specific guanine nucleotide exchange factor domains. Trio is a multifunctional protein that integrates and amplifies signals involved in coordinating actin remodeling, which is necessary for cell migration and growth.

    Other members of the family are transfer proteins, including a guanine nucleotide exchange factor that may function as an effector of RAC1, a phosphatidylinositol/phosphatidylcholine transfer protein that is required for the transport of secretory proteins from the Golgi complex, and an alpha-tocopherol transfer protein that enhances the transfer of the ligand between separate membranes.

    Proteins where this domain is known:
    MAL7P1.83    PF11_0287    PFF1280w    PFF1450w    PFI1015w   


    SSF52096 - SSF52096 (Superfamily link)

    Proteins where this domain is known:
    PF10_0167    PF14_0232    PF14_0348    PF14_0664    PFC0310c    PFL1940w   


    SSF52113 - BRCT (Superfamily link)

    Interpro entry IPR001357 : BRCT (Interpro link)

    Interpro description:

    The BRCT domain (after the C-terminal domain of a breast cancer susceptibility protein) is found predominantly in proteins involved in cell cycle checkpoint functions responsive to DNA damage, for example as found in the breast cancer DNA-repair protein BRCA1. The domain is an approximately 100 amino acid tandem repeat, which appears to act as a phospho-protein binding domain.

    A chitin biosynthesis protein from yeast also seems to belong to this group.

    Proteins where this domain is known:
    PF11_0090    PFB0895c    PFI0510c   


    SSF52129 - Caspase-like (Superfamily link)

    Proteins where this domain is known:
    PF13_0289   


    SSF52141 - UDNA_glycsylseSF (Superfamily link)

    Interpro entry IPR005122 : (Interpro link)

    Interpro description:

    This entry represents various uracil-DNA glycosylases and related DNA glycosylases, such as uracil-DNA glycosylase, thermophilic uracil-DNA glycosylase, G:T/U mismatch-specific DNA glycosylase (Mug), and single-strand selective monofunctional uracil-DNA glycosylase (SMUG1). These proteins have a 3-layer alpha/beta/alpha structure. Uracil-DNA glycosylases are DNA repair enzymes that excise uracil residues from DNA by cleaving the N-glycosylic bond, initiating the base excision repair pathway. Uracil in DNA can arise either through the deamination of cytosine to form mutagenic U:G mispairs, or through the incorporation of dUMP by DNA polymerase to form U:A pairs. These aberrant uracil residues are genotoxic. The sequence of uracil-DNA glycosylase is extremely well conserved in bacteria and eukaryotes as well as in herpes viruses. More distantly related uracil-DNA glycosylases are also found in poxviruses. In eukaryotic cells, UNG activity is found in both the nucleus and the mitochondria. Human UNG1 protein is transported to both the mitochondria and the nucleus. The N-terminal 77 amino acids of UNG1 seem to be required for mitochondrial localization, but the presence of a mitochondrial transit peptide has not been directly demonstrated. The most N-terminal conserved region contains an aspartic acid residue which has been proposed, based on X-ray structures, to act as a general base in the catalytic mechanism.

    Proteins where this domain is known:
    PF14_0148   


    SSF52151 - Acyl_Trfase/lysoPlipase (Superfamily link)

    Interpro entry IPR016035 : Acyl transferase/acyl hydrolase/lysophospholipase (Interpro link)

    Interpro description:

    This entry represents a structural domain with a 3-layer alpha/beta/alpha topology. This domain can be found in acyl transferases such as bacterial malonyl-CoA ACP transacylase (FabD) and the homologous domain from eukaryotic fatty acid synthase. This domain is also found in lysophospholipases such as cytosolic phospholipase A2 (which has additional structural features), and in patatin proteins, which are plant glycoproteins that act as non-specific lipid acyl hydrolases.

    Proteins where this domain is known:
    MAL13P1.285    PF13_0066    PFB0410c    PFI1180w   


    SSF52156 - SSF52156 (Superfamily link)

    Proteins where this domain is known:
    PF08_0018    PF13_0069    PFE0830c    PFF0345w   


    SSF52161 - Ribosomal_L13 (Superfamily link)

    Interpro entry IPR005822 : Ribosomal protein L13 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L13 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L13 is known to be one of the early assembly proteins of the 50S ribosomal subunit.

    Proteins where this domain is known:
    PF10_0043    PFB0645c   


    SSF52166 - Ribosomal_L4/L1E (Superfamily link)

    Interpro entry IPR002136 : Ribosomal protein L4/L1e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This family includes ribosomal L4/L1 from eukaryotes and plants and L4 from bacteria. L4 from yeast has been shown to bind rRNA. These proteins have 246 (plant) to 427 (human) amino acids.

    Proteins where this domain is known:
    PF08_0038    PFE0350c   


    SSF52210 - CoA_ligase (Superfamily link)

    Interpro entry IPR016102 : (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of three layers (alpha/beta/alpha). This domain is found in both the alpha and beta chains of succinyl-CoA synthase (both the GDP-forming and the ADP-forming enzymes). This domain can also be found in ATP citrate synthase, malate-CoA ligase and acetate-CoA ligase (or acetyl-CoA synthase), as well as bacterial Fdr. Some members of the domain utilise ATP; others use GTP.

    Proteins where this domain is known:
    PF11_0097    PF14_0295    PF14_0357   


    SSF52218 - SSF52218 (Superfamily link)

    Proteins where this domain is known:
    PF14_0478    PFE1240w    PFI1140w   


    SSF52283 - SSF52283 (Superfamily link)

    Proteins where this domain is known:
    PF14_0508    PFE1050w   


    SSF52313 - Ribosomal_S2 (Superfamily link)

    Interpro entry IPR001865 : Ribosomal protein S2 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal S2 proteins have been shown to belong to a family that includes 40S ribosomal subunit 40kDa proteins, putative laminin-binding proteins, NAB-1 protein and 29.3kDa protein from Haloarcula marismortui. The laminin-receptor proteins are thus predicted to be the eukaryotic homologue of the eubacterial S2 ribosomal proteins.

    Proteins where this domain is known:
    PF10_0264   


    SSF52317 - SSF52317 (Superfamily link)

    Proteins where this domain is known:
    PF10_0123    PF13_0044    PF14_0100    PFF1335c    PFI1100w   


    SSF52335 - SSF52335 (Superfamily link)

    Proteins where this domain is known:
    PF13_0044   


    SSF52343 - SSF52343 (Superfamily link)

    Proteins where this domain is known:
    PF13_0353    PF14_0478    PFF1115w    PFI1140w   


    SSF52374 - SSF52374 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.281    MAL13P1.86    MAL8P1.125    PF07_0018    PF08_0011    PF10_0053    PF10_0149    PF10_0340    PF11_0181    PF13_0159    PF13_0170    PF13_0179    PF13_0205    PF13_0253    PF13_0257    PF14_0589    PFC0470w    PFF1095w    PFI0680c    PFL0900c    PFL1210w    PFL2485c   


    SSF52402 - SSF52402 (Superfamily link)

    Proteins where this domain is known:
    PF10_0123    PF10_0147    PF10_0191    PF13_0137    PF14_0389    PFC0395w    PFD0555c    PFF0610c    PFI1310w    PFL1080c   


    SSF52425 - DNA_photolyase_N (Superfamily link)

    Interpro entry IPR006050 : DNA photolyase, N-terminal (Interpro link)

    Interpro description:

    DNA photolyases are enzymes that bind to DNA containing pyrimidine dimers: on absorption of visible light, they catalyse dimer splitting into the constituent monomers, a process called photoreactivation. This is a DNA repair mechanism, repairing mismatched pyrimidine dimers induced by exposure to ultra-violet light. The precise mechanisms involved in substrate binding, conversion of light energy to the mechanical energy needed to rupture the cyclobutane ring, and subsequent release of the product are uncertain. Analysis of DNA photolyases has revealed the presence of an intrinsic chromophore, all monomers containing a reduced FAD moiety, and, in addition, either a reduced pterin or 8-hydroxy-5-deazaflavin as a second chromophore. Either chromophore may act as the primary photon acceptor, peak absorptions occurring in the blue region of the spectrum and in the UV-B region, at a wavelength around 290 nm.

    This domain binds a light harvesting cofactor.

    Proteins where this domain is known:
    PFE0675c   


    SSF52440 - PreATP-grasp-like (Superfamily link)

    Interpro entry IPR016185 : PreATP-grasp-like fold (Interpro link)

    Interpro description:

    The ATP-grasp fold is one of several distinct ATP-binding folds, and is found in enzymes that catalyse the formation of amide bonds, catalysing the ATP-dependent ligation of a carboxylate-containing molecule to an amino or thiol group-containing molecule. This fold is found in many different enzyme families, including various peptide synthetases, biotin carboxylase, synapsin, succinyl-CoA synthetase, pyruvate phosphate dikinase, glutathione synthetase and glutathionylspermidine synthase, amongst others. These enzymes contribute predominantly to macromolecular synthesis, using ATP-hydrolysis to activate their substrates.

    This entry represents the pre-ATP-grasp structural domain, which precedes the ATP-grasp domain in all superfamily members, and which usually occurs at the N-terminus of the protein. The structure of the pre-ATP-grasp domain consists of alpha/beta/alpha in three layers, and is possibly a rudimentary form of the Rossmann fold. This domain can have a substrate-binding function.

    Proteins where this domain is known:
    PF13_0044    PF14_0664    PFE0605c   


    SSF52467 - SSF52467 (Superfamily link)

    Proteins where this domain is known:
    PF13_0152    PF14_0125    PF14_0489    PF14_0508    PFF0945c   


    SSF52490 - Tubulin_FtsZ (Superfamily link)

    Interpro entry IPR003008 : Tubulin/FtsZ, GTPase (Interpro link)

    Interpro description:

    This domain is found in all tubulin chains, as well as the bacterial FtsZ family of proteins. These proteins are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ and tubulin are GTPases; this entry represents the GTPase domain. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea.

    Proteins where this domain is known:
    PF08_0125    PF10_0084    PF14_0725    PFD1050w    PFI0180w    PFI1635w   


    SSF52499 - Iscrsm_hydrolase (Superfamily link)

    Interpro entry IPR000868 : Isochorismatase hydrolase (Interpro link)

    Interpro description:
    This is a family of hydrolase enzymes. Isochorismatase, also known as 2,3 dihydro-2,3 dihydroxybenzoate synthase, catalyses the conversion of isochorismate, in the presence of water, to 2,3-dihydroxybenzoate and pyruvate.

    Proteins where this domain is known:
    PFC0910w   


    SSF52507 - Flavoprotein (Superfamily link)

    Interpro entry IPR003382 : Flavoprotein (Interpro link)

    Interpro description:
    This entry contains a diverse range of flavoprotein enzymes, including epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN. This enzyme catalyzes the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a --C==C-- double bond. This family also includes the B chain of dipicolinate synthase. Dipicolinate is a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance. Dipicolinate synthase catalyses the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenylacrylic acid decarboxylase.

    Proteins where this domain is known:
    MAL8P1.81   


    SSF52518 - SSF52518 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.186    PF08_0045    PF11_0256    PF13_0070    PF14_0441    PFE0225w    PFF0530w    PFF0945c   


    SSF52540 - SSF52540 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.13    MAL13P1.134    MAL13P1.14    MAL13P1.148    MAL13P1.164    MAL13P1.166    MAL13P1.192    MAL13P1.205    MAL13P1.216    MAL13P1.241    MAL13P1.243    MAL13P1.262    MAL13P1.294    MAL13P1.297    MAL13P1.322    MAL13P1.344    MAL13P1.40    MAL13P1.51    MAL13P1.96    MAL7P1.113    MAL7P1.12    MAL7P1.122    MAL7P1.162    MAL7P1.201    MAL7P1.206    MAL7P1.209    MAL7P1.89    MAL8P1.132    MAL8P1.144    MAL8P1.19    MAL8P1.33    MAL8P1.53    MAL8P1.65    MAL8P1.75a    MAL8P1.76    MAL8P1.92    MAL8P1.99    PF07_0023    PF07_0047    PF07_0062    PF07_0104    PF08_0018    PF08_0040    PF08_0042    PF08_0048    PF08_0062    PF08_0063    PF08_0078    PF08_0096    PF08_0100    PF08_0110    PF08_0111    PF08_0117    PF08_0126    PF10_0041    PF10_0057    PF10_0081    PF10_0086    PF10_0099    PF10_0203    PF10_0209    PF10_0224    PF10_0232    PF10_0294    PF10_0309    PF10_0337    PF10_0368    PF10_0369    PF11_0053    PF11_0071    PF11_0077    PF11_0078    PF11_0087    PF11_0117    PF11_0131    PF11_0143    PF11_0175    PF11_0183    PF11_0203    PF11_0225    PF11_0240    PF11_0245    PF11_0249    PF11_0296    PF11_0314    PF11_0317    PF11_0405    PF11_0414    PF11_0416    PF11_0461    PF11_0465    PF11_0466    PF11_0478    PF13_0033    PF13_0037    PF13_0063    PF13_0065    PF13_0069    PF13_0077    PF13_0090    PF13_0095    PF13_0119    PF13_0177    PF13_0187    PF13_0189    PF13_0218    PF13_0233    PF13_0261    PF13_0271    PF13_0273    PF13_0287    PF13_0291    PF13_0304    PF13_0305    PF13_0308    PF13_0330    PF13_0334    PF13_0350    PF14_0051    PF14_0052    PF14_0063    PF14_0081    PF14_0100    PF14_0104    PF14_0112    PF14_0114    PF14_0126    PF14_0133    PF14_0147    PF14_0159    PF14_0177    PF14_0183    PF14_0185    PF14_0221    PF14_0234    PF14_0244    PF14_0254    PF14_0278    PF14_0292    PF14_0321    PF14_0326    PF14_0339    PF14_0345    PF14_0370    PF14_0399    PF14_0400    PF14_0415    PF14_0429    PF14_0436    PF14_0437    
PF14_0455    PF14_0477    PF14_0485    PF14_0486    PF14_0548    PF14_0563    PF14_0564    PF14_0593    PF14_0599    PF14_0601    PF14_0616    PF14_0626    PF14_0655    PFA0180w    PFA0185w    PFA0330w    PFA0335w    PFA0495c    PFA0530c    PFA0535c    PFA0545c    PFA0555c    PFA0590w    PFB0445c    PFB0500c    PFB0720c    PFB0730w    PFB0795w    PFB0840w    PFB0860c    PFB0895c    PFC0125w    PFC0140c    PFC0190c    PFC0260w    PFC0440c    PFC0565w    PFC0770c    PFC0860w    PFC0875w    PFC0915w    PFC0955w    PFC1010w    PFD0245c    PFD0305c    PFD0385c    PFD0530c    PFD0565c    PFD0665c    PFD0685c    PFD0710w    PFD0725c    PFD0755c    PFD0790c    PFD0810w    PFD0935c    PFD1060w    PFD1070w    PFE0155w    PFE0175c    PFE0205w    PFE0215w    PFE0270c    PFE0430w    PFE0450w    PFE0625w    PFE0665c    PFE0690c    PFE0705c    PFE0830c    PFE0925c    PFE1085w    PFE1090w    PFE1150w    PFE1215c    PFE1255w    PFE1345c    PFE1390w    PFE1435c    PFF0100w    PFF0115c    PFF0155w    PFF0225w    PFF0285c    PFF0345w    PFF0385c    PFF0625w    PFF0675c    PFF0810c    PFF0940c    PFF1140c    PFF1185w    PFF1500c    PFI0155c    PFI0165c    PFI0260c    PFI0355c    PFI0480w    PFI0525w    PFI0570w    PFI0860c    PFI0865w    PFI0910w    PFI1005w    PFI1420w    PFI1505c    PFI1550c    PFI1650w    PFL0075w    PFL0100c    PFL0150w    PFL0205w    PFL0380c    PFL0425c    PFL0495c    PFL0545w    PFL0560c    PFL0580w    PFL0835w    PFL0895c    PFL0975w    PFL1310c    PFL1410c    PFL1435c    PFL1500w    PFL1525c    PFL1590c    PFL1650w    PFL1710c    PFL1725w    PFL1925w    PFL2005w    PFL2010c    PFL2165w    PFL2190c    PFL2245w    PFL2345c    PFL2440w    PFL2465c    PFL2475w   


    SSF52743 - Pept_S8_S53 (Superfamily link)

    Interpro entry IPR000209 : Peptidase S8 and S53, subtilisin, kexin, sedolisin (Interpro link)

    Interpro description:

    Proteolytic enzymes that exploit serine in their catalytic activity are ubiquitous, being found in viruses, bacteria and eukaryotes. They include a wide range of peptidase activity, including exopeptidase, endopeptidase, oligopeptidase and omega-peptidase activity. Over 20 families (denoted S1 - S66) of serine protease have been identified, these being grouped into clans on the basis of structural similarity and other functional evidence. Structures are known for members of the clans and the structures indicate that some appear to be totally unrelated, suggesting different evolutionary origins for the serine peptidases.

    Notwithstanding their different evolutionary origins, there are similarities in the reaction mechanisms of several peptidases. Chymotrypsin, subtilisin and carboxypeptidase C have a catalytic triad of serine, aspartate and histidine in common: serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The geometric orientations of the catalytic residues are similar between families, despite different protein folds. The linear arrangements of the catalytic residues commonly reflect clan relationships. For example, the catalytic triad in the chymotrypsin clan (PA) is ordered HDS, but is ordered DHS in the subtilisin clan (SB) and SDH in the carboxypeptidase clan (SC).
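
    The clan-specific residue orders quoted above can be written down as a small lookup. The following is a minimal sketch, not part of InterPro or MEROPS: the names TRIAD_ORDER and clans_for_order are hypothetical, and only the three clans named in the text are covered.

```python
# Linear order of the catalytic-triad residues along the sequence, per clan,
# exactly as stated in the description (H = His, D = Asp, S = Ser).
# TRIAD_ORDER and clans_for_order are illustrative names, not a real API.
TRIAD_ORDER = {
    "PA": "HDS",  # chymotrypsin clan
    "SB": "DHS",  # subtilisin clan
    "SC": "SDH",  # carboxypeptidase clan
}


def clans_for_order(positions):
    """Return the clans whose triad order matches the given residue positions.

    positions maps a one-letter residue code ("H", "D", "S") to its
    position in the protein sequence.
    """
    # Sort the residue letters by sequence position to get the linear order.
    order = "".join(sorted(positions, key=positions.get))
    return [clan for clan, expected in TRIAD_ORDER.items() if expected == order]
```

    For instance, clans_for_order({"H": 57, "D": 102, "S": 195}), using the conventional chymotrypsin numbering as an illustrative input, returns ["PA"]. A matching order is only a hint of clan membership, since the text notes that the order "commonly" reflects clan relationships.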

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This group of serine peptidases belongs to the MEROPS peptidase families S8 (subfamilies S8A (subtilisin) and S8B (kexin)) and S53 (sedolisin), both of which are members of clan SB.

    The subtilisin family is the second largest serine protease family characterised to date. Over 200 subtilases are presently known, more than 170 of them with complete amino acid sequences. The family is widespread, being found in eubacteria, archaebacteria, eukaryotes and viruses. The vast majority of the family are endopeptidases, although there is an exopeptidase, tripeptidyl peptidase. Structures have been determined for several members of the subtilisin family: they exploit the same catalytic triad as the chymotrypsins, although the residues occur in a different order (HDS in chymotrypsin and DHS in subtilisin), but the structures show no other similarity. Some subtilisins are mosaic proteins, and others contain N- and C-terminal extensions that show no sequence similarity to any other known protein. Based on sequence homology, a subdivision into six families has been proposed.

    The proprotein-processing endopeptidases kexin, furin and related enzymes form a distinct subfamily known as the kexin subfamily (S8B). These preferentially cleave C-terminally to paired basic amino acids. Members of this subfamily can be identified by subtly different motifs around the active site. Members of the kexin family, along with endopeptidases R, T and K from the yeast Tritirachium and cuticle-degrading peptidase from Metarhizium, require thiol activation. This can be attributed to the presence of Cys-173 near to the active histidine. Only one viral member of the subtilisin family is known, a 56-kDa protease from herpes virus 1, which infects the channel catfish.
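
    As a rough illustration of "cleave C-terminally to paired basic amino acids", the sketch below scans a protein sequence for adjacent lysine/arginine pairs. It is a deliberate simplification under stated assumptions: every K/R pair is treated as a candidate, whereas real kexin-like enzymes are more selective, and the function name dibasic_cleavage_sites is hypothetical.

```python
import re


def dibasic_cleavage_sites(seq):
    """Return 1-based positions after which a dibasic-site cleavage could occur.

    Assumption: any two adjacent basic residues (K or R) count as a site.
    The reported position is that of the second residue of the pair, i.e.
    the bond cut is the one immediately C-terminal to the pair.
    """
    # A lookahead is used so that overlapping pairs (e.g. "KRR") are all found.
    return [m.start() + 2 for m in re.finditer(r"(?=[KR][KR])", seq.upper())]
```

    With the made-up sequence "AAKRSS", the function returns [4], meaning the candidate cut lies between residues 4 and 5.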

    Sedolisins (serine-carboxyl peptidases) are proteolytic enzymes whose fold resembles that of subtilisin; however, they are considerably larger, with the mature catalytic domains containing approximately 375 amino acids. The defining features of these enzymes are a unique catalytic triad, Ser-Glu-Asp, as well as the presence of an aspartic acid residue in the oxyanion hole. High-resolution crystal structures have now been solved for sedolisin from Pseudomonas sp. 101, as well as for kumamolisin from a thermophilic bacterium, Bacillus sp. MN-32. Mutations in the human gene lead to a fatal neurodegenerative disease.

    Proteins where this domain is known:
    PF11_0381    PFE0355c    PFE0370c   


    SSF52768 - Arginase/deacetylase (Superfamily link)

    Proteins where this domain is known:
    PF10_0078    PF14_0690    PFI0320w    PFI1260c   


    SSF52777 - SSF52777 (Superfamily link)

    Proteins where this domain is known:
    PF10_0407    PF13_0121    PFC0170c   


    SSF52799 - SSF52799 (Superfamily link)

    Proteins where this domain is known:
    PF11_0139    PF11_0281    PF13_0027    PF14_0524    PF14_0525    PFC0380w   


    SSF52821 - Rhodanese-like (Superfamily link)

    Interpro entry IPR001763 : (Interpro link)

    Interpro description:

    Rhodanese, a sulphurtransferase involved in cyanide detoxification, shares an evolutionary relationship with a large family of proteins.

    Rhodanese has an internal duplication. This domain is found as a single copy in other proteins, including phosphatases and ubiquitin C-terminal hydrolases.

    Proteins where this domain is known:
    PF13_0027    PFL0320w   


    SSF52833 - Thiordxn-like_fd (Superfamily link)

    Interpro entry IPR012336 : (Interpro link)

    Interpro description:

    Several biological processes regulate the activity of target proteins through changes in the redox state of thiol groups (S2 to SH2), where a hydrogen donor is linked to an intermediary disulphide protein. Such processes include the ferredoxin/thioredoxin system, the NADP/thioredoxin system, and the glutathione/glutaredoxin system. Several of these disulphide proteins share a common structure, consisting of a three-layer alpha/beta/alpha core. Proteins that contain domains with a thioredoxin fold include:

    Proteins where this domain is known:
    MAL13P1.100    MAL13P1.225    MAL7P1.159    MAL7P1.88    MAL8P1.17    PF07_0034    PF07_0036    PF08_0032    PF08_0131    PF10_0066    PF10_0134    PF10_0268    PF11_0055    PF11_0099    PF11_0286    PF11_0352    PF13_0214    PF13_0272    PF14_0186    PF14_0187    PF14_0368    PF14_0545    PF14_0590    PF14_0694    PFC0166w    PFC0205c    PFC0271c    PFE0820c    PFE1450c    PFF0340c    PFI0245c    PFI0790w    PFI0945w    PFI0950w    PFI1250w    PFL0595c    PFL0725w    PFL1520w   


    SSF52922 - Transketo_C_like (Superfamily link)

    Interpro entry IPR009014 : Transketolase, C-terminal/Pyruvate-ferredoxin oxidoreductase, domain II (Interpro link)

    Interpro description:

    Transketolase C-terminal-like domains can be found in a number of different enzymes, including the C-terminal domain of the pyruvate dehydrogenase E1 component, the C-terminal domain of branched-chain alpha-keto acid dehydrogenases, and domain II of pyruvate-ferredoxin oxidoreductase (PFOR). Structural studies reveal that this domain comprises three layers (alpha/beta/alpha). The mixed beta sheet consists of five strands in the order 13245, where strand 1 is antiparallel to the others.

    Proteins where this domain is known:
    MAL13P1.186    PF14_0441    PFE0225w    PFF0530w   


    SSF52935 - Pyruvate_kinase (Superfamily link)

    Interpro entry IPR015795 : (Interpro link)

    Interpro description:

    Pyruvate kinase (PK) catalyses the final step in glycolysis, the conversion of phosphoenolpyruvate to pyruvate with concomitant phosphorylation of ADP to ATP:

     ADP + phosphoenolpyruvate = ATP + pyruvate 

    The enzyme, which is found in all living organisms, requires both magnesium and potassium ions for its activity. In vertebrates, there are four tissue-specific isozymes: L (liver), R (red cells), M1 (muscle, heart and brain), and M2 (early foetal tissue). In plants, PK exists as cytoplasmic and plastid isozymes, while most bacteria and lower eukaryotes have one form, although certain bacteria, such as Escherichia coli, have two isozymes. All isozymes appear to be tetramers of identical subunits of ~500 residues.

    PK helps control the rate of glycolysis, along with phosphofructokinase and hexokinase. PK possesses allosteric sites for numerous effectors, yet the isozymes respond differently, in keeping with their different tissue distributions. The activity of L-type (liver) PK is increased by fructose-1,6-bisphosphate (F1,6BP) and lowered by ATP and alanine (a gluconeogenic precursor); therefore, when glucose levels are high, glycolysis is promoted, and when levels are low, gluconeogenesis is promoted. L-type PK is also hormonally regulated, being activated by insulin and inhibited by glucagon, which covalently modifies the PK enzyme. M1-type (muscle, brain) PK is inhibited by ATP, but F1,6BP and alanine have no effect, which correlates with the function of muscle and brain, as opposed to the liver.

    The structures of several pyruvate kinases from various organisms have been determined. The protein comprises three or four domains: a small N-terminal helical domain (absent in bacterial PK), a beta/alpha-barrel domain, a beta-barrel domain (inserted within the beta/alpha-barrel domain), and a 3-layer alpha/beta/alpha sandwich domain.

    This entry represents the 3-layer alpha/beta/alpha sandwich domain. This domain has a similar topology to the archaeal hypothetical protein, MTH1675 from Methanobacterium thermoautotrophicum.

    Proteins where this domain is known:
    PF10_0363    PFF1300w   


    SSF52943 - ATPase_gamma (Superfamily link)

    Interpro entry IPR000131 : ATPase, F1 complex, gamma subunit (Interpro link)

    Interpro description:

    ATPases (or ATP synthases) are membrane-bound enzyme complexes/ion transporters that combine ATP synthesis and/or hydrolysis with the transport of protons across a membrane. ATPases can harness the energy from a proton gradient, using the flux of ions across the membrane via the ATPase proton channel to drive the synthesis of ATP. Some ATPases work in reverse, using the energy from the hydrolysis of ATP to create a proton gradient. There are different types of ATPases, which can differ in function (ATP synthesis and/or hydrolysis), structure (F-, V- and A-ATPases contain rotary motors) and in the type of ions they transport.

    F-ATPases (also known as F1F0-ATPase, or H(+)-transporting two-sector ATPase) are composed of two linked complexes: the F1 ATPase complex is the catalytic core and is composed of 5 subunits (alpha, beta, gamma, delta, epsilon), while the F0 ATPase complex is the membrane-embedded proton channel that is composed of at least 3 subunits (A-C), nine in mitochondria (A-G, F6, F8). Both the F1 and F0 complexes are rotary motors that are coupled back-to-back. In the F1 complex, the central gamma subunit forms the rotor inside the cylinder made of the alpha(3)beta(3) subunits, while in the F0 complex, the ring of C subunits forms the rotor. The two rotors rotate in opposite directions, but the F0 rotor is usually stronger, using the force from the proton gradient to push the F1 rotor in reverse in order to drive ATP synthesis. These ATPases can also work in reverse to hydrolyse ATP to create a proton gradient.

    The ATPase F1 complex gamma subunit forms the central shaft that connects the F0 rotary motor to the F1 catalytic core. The gamma subunit functions as a rotary motor inside the cylinder formed by the alpha(3)beta(3) subunits in the F1 complex. The best-conserved region of the gamma subunit is its C-terminus, which seems to be essential for assembly and catalysis.

    More information about this protein can be found at Protein of the Month: ATP Synthases.

    Proteins where this domain is known:
    PF13_0061   


    SSF52949 - SSF52949 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.74    MAL7P1.83    PF14_0439    PF14_0466   


    SSF52954 - Anticodon_bd (Superfamily link)

    Interpro entry IPR004154 : Anticodon-binding (Interpro link)

    Interpro description:
    tRNA synthetases, or tRNA ligases, are involved in protein synthesis. This domain is found in histidyl, glycyl, threonyl and prolyl tRNA synthetases; it is probably the anticodon-binding domain.

    Proteins where this domain is known:
    PF11_0270    PF14_0198    PF14_0428    PFI1240c    PFI1645c    PFL0670c   


    SSF52972 - SSF52972 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.69    MAL7P1.110    PFI0310w   


    SSF52980 - Restrict_endonuc_II-like_core (Superfamily link)

    Interpro entry IPR011335 : Restriction endonuclease, type II-like, core (Interpro link)

    Interpro description:

    Type II restriction endonucleases are components of prokaryotic DNA restriction-modification mechanisms that protect the organism against invading foreign DNA. These site-specific deoxyribonucleases catalyse the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. Of the 3000 restriction endonucleases that have been characterised, most are homodimeric or tetrameric enzymes that cleave target DNA at sequence-specific sites close to the recognition site. For homodimeric enzymes, the recognition site is usually a palindromic sequence 4-8 bp in length. Most enzymes require magnesium ions as a cofactor for catalysis. Although they can vary in their mode of recognition, many restriction endonucleases share a similar structural core comprising four beta-strands and one alpha-helix, as well as a similar mechanism of cleavage, suggesting a common ancestral origin. However, there is still considerable diversity amongst restriction endonucleases. The target site recognition process triggers large conformational changes of the enzyme and the target DNA, leading to the activation of the catalytic centres. Like other DNA binding proteins, restriction enzymes are capable of non-specific DNA binding as well, which is the prerequisite for efficient target site location by facilitated diffusion. Non-specific binding usually does not involve interactions with the bases but only with the DNA backbone.
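
    The statement that homodimeric enzymes usually recognise a palindromic sequence of 4-8 bp can be checked mechanically: a site is palindromic in this sense when it equals its own reverse complement. The sketch below assumes plain A/C/G/T sites without degenerate bases; the function names are hypothetical and the 4-8 bp bounds simply mirror the "usually" in the text.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(site):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return site.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic_site(site, min_len=4, max_len=8):
    """True if the site reads the same on both strands and is 4-8 bp long."""
    s = site.upper()
    return min_len <= len(s) <= max_len and s == reverse_complement(s)
```

    For example, is_palindromic_site("GAATTC"), the EcoRI recognition site, evaluates to True, while a non-symmetric string such as "GAATTA" evaluates to False.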

    There are four classes of restriction endonucleases: types I, II, III and IV. All types of enzymes recognise specific short DNA sequences and carry out the endonucleolytic cleavage of DNA to give specific double-stranded fragments with terminal 5'-phosphates. They differ in their recognition sequence, subunit composition, cleavage position, and cofactor requirements, as summarised below:

    This entry represents the core structure found in most type II restriction endonucleases, consisting of a 3-layer alpha/beta/alpha topology with mixed beta-sheets. This core structure can be found in the restriction endonucleases EcoRI, EcoRV, BamHI, BglI, BglII, BstyI, PvuII, MunI, NseI, NgoIV, BsobI, HincII, MspI, FokI (C-terminal), EcoO109IR, as well as in lambda exonuclease, DNA mismatch repair protein MutH, VSR (very short repair) endonucleases, TnsA endonucleases (N-terminal), endonuclease I (Holliday junction resolvase), Hjc-like enzymes, XPF/Rad1/Mus81 nucleases, RecB and RecC exodeoxyribonuclease V (C-terminal), and RecU-like enzymes.

    Proteins where this domain is known:
    MAL13P1.346    PF14_0470   


    SSF53032 - tRNA_int_endo_C (Superfamily link)

    Interpro entry IPR006677 : tRNA intron endonuclease, catalytic domain-like (Interpro link)

    Interpro description:

    This entry represents a 3-layer alpha/beta/alpha domain found as the catalytic domain at the C-terminus in homotetrameric tRNA-intron endonucleases, and as domains 2 and 4 (C-terminal) in the homodimeric enzymes. tRNA-intron endonucleases remove tRNA introns by cleaving pre-tRNA at the 5'- and 3'-splice sites to release the intron. The products are an intron and two tRNA half-molecules bearing 2',3' cyclic phosphate and 5'-hydroxyl termini. These enzymes recognise a pseudosymmetric substrate in which 2 bulged loops of 3 bases are separated by a stem of 4 bp. Although homotetrameric enzymes contain four active sites, only two participate in the cleavage, and the enzyme should therefore be considered a dimer of dimers.

    Proteins where this domain is known:
    PF14_0514    PFL2300w   


    SSF53036 - RNA_pol_Rpb5_N (Superfamily link)

    Interpro entry IPR005571 : RNA polymerase, Rpb5, N-terminal (Interpro link)

    Interpro description:

    Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).

    This entry represents the N-terminal domain of eukaryotic RPB5, which has a core structure consisting of 3 layers alpha/beta/alpha. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA.

    Proteins where this domain is known:
    PF13_0341   


    SSF53067 - SSF53067 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.540    MAL7P1.153    MAL7P1.228    PF07_0033    PF07_0077    PF08_0054    PF10_0299    PF11_0047    PF11_0114    PF11_0351    PF13_0269    PF14_0124    PF14_0218    PFA0190c    PFD0440w    PFD0487c    PFE0255w    PFF1155w    PFI0520w    PFI0875w    PFL2215w   


    SSF53092 - SSF53092 (Superfamily link)

    Proteins where this domain is known:
    PF14_0517   


    SSF53098 - RNaseH_fold (Superfamily link)

    Interpro entry IPR012337 : Polynucleotidyl transferase, Ribonuclease H fold (Interpro link)

    Interpro description:

    The catalytic domains of several polynucleotidyl transferases share a similar structure, consisting of a 3-layer alpha/beta/alpha fold that contains mixed beta sheets, suggesting that they share a similar mechanism of catalysis. Polynucleotidyl transferases containing this domain include ribonuclease H class I (RNase HI) and class II (RNase HII), HIV RNase (reverse transcriptase domain), retroviral integrase (catalytic domain), Mu transposase (core domain), transposase inhibitor Tn5 (containing additional all-alpha subdomains), DnaQ-like 3'-5' exonucleases (exonuclease domains), RuvC resolvase, and mitochondrial resolvase ydc2 (catalytic domain).

    Proteins where this domain is known:
    MAL13P1.311    MAL8P1.104    MAL8P1.35    PF10_0165    PF10_0362    PF13_0208    PF14_0112    PF14_0413    PFA0290w    PFB0215c    PFD0590c    PFF1150w    PFF1470c   


    SSF53137 - SSF53137 (Superfamily link)

    Proteins where this domain is known:
    PF14_0230    PF14_0519    PFB0550w    PFE0810c    PFF0650w   


    SSF53150 - DNA_mismatch_repair_MutS_connt (Superfamily link)

    Interpro entry IPR007860 : DNA mismatch repair protein MutS, connector (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex structure, and is composed of:

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the connector domain (domain 2) found in proteins of the MutS family. The structure of the MutS connector domain consists of a parallel beta-sheet surrounded by four alpha helices, which is similar to the structure of the Holliday junction resolvase RuvC.

    Proteins where this domain is known:
    PFE0270c   


    SSF53167 - Purine and uridine phosphorylases (Superfamily link)

    Proteins where this domain is known:
    PFE0660c   


    SSF53187 - SSF53187 (Superfamily link)

    Proteins where this domain is known:
    PF14_0439    PFA0170c    PFI1570c   


    SSF53223 - SSF53223 (Superfamily link)

    Proteins where this domain is known:
    PF08_0132    PF14_0164    PF14_0286    PFF1490w   


    SSF53244 - Mur_ligase_C (Superfamily link)

    Interpro entry IPR004101 : Mur ligase, C-terminal (Interpro link)

    Interpro description:

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria is comprised of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:

    Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.

    This entry represents the C-terminal domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes the C-terminal domain of folylpolyglutamate synthase, which transfers glutamate to folylpolyglutamate, and of cyanophycin synthetase, which catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).

    This C-terminal domain is almost always found together with the N-terminal domain of the cytoplasmic peptidoglycan synthetases.

    Proteins where this domain is known:
    PF13_0140   


    SSF53254 - SSF53254 (Superfamily link)

    Proteins where this domain is known:
    PF11_0208    PF14_0094    PF14_0282    PFB0380c    PFC0430w    PFD0660w   


    SSF53271 - PRTase-like (Superfamily link)

    Proteins where this domain is known:
    PF10_0121    PF13_0143    PF13_0157    PFE0630c   


    SSF53300 - SSF53300 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.76    PF08_0036    PF08_0109    PF08_0136b    PF13_0201    PF13_0324    PF14_0326    PFC0640w    PFD0250c    PFF0380w    PFF0800w   


    SSF53328 - formyl_transf (Superfamily link)

    Interpro entry IPR002376 : Formyl transferase, N-terminal (Interpro link)

    Interpro description:
    A number of formyl transferases belong to this group. Methionyl-tRNA formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA. The formyl group appears to play a dual role in the initiator identity of N-formylmethionyl-tRNA by promoting its recognition by IF2 and by impairing its binding to EFTU-GTP. Formyltetrahydrofolate dehydrogenase produces formate from formyl-tetrahydrofolate. This is the N-terminal domain of these enzymes and is found upstream of the C-terminal domain.

    The trifunctional glycinamide ribonucleotide synthetase-aminoimidazole ribonucleotide synthetase-glycinamide ribonucleotide transformylase catalyses the second, third and fifth steps in de novo purine biosynthesis. The glycinamide ribonucleotide transformylase belongs to this group.

    Proteins where this domain is known:
    MAL13P1.67   


    SSF53335 - SSF53335 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.214    MAL13P1.255    MAL13P1.31    MAL7P1.130    MAL7P1.151    PF07_0015    PF07_0020    PF07_0123    PF08_0092    PF10_0179    PF10_0197    PF10_0215    PF10_0274    PF11_0116    PF11_0301    PF11_0305    PF11_0348    PF11_0439    PF11_0468    PF11_0482    PF13_0016    PF13_0052    PF13_0087    PF13_0236    PF13_0286    PF13_0323    PF14_0068    PF14_0156    PF14_0242    PF14_0309    PF14_0376    PF14_0481    PF14_0526    PFB0220w    PFC0390w    PFD0350w    PFD0460c    PFD1080w    PFE1115c    PFE1535w    PFI0415c    PFI0700c    PFI0815c    PFI1235w    PFL0125c    PFL1475w    PFL1775c    PFL2230c    PFL2305w    PFL2395c   


    SSF53383 - PyrdxlP-dep_Trfase_major (Superfamily link)

    Interpro entry IPR015424 : (Interpro link)

    Interpro description:

    Pyridoxal phosphate (PLP) is the active form of vitamin B6 (pyridoxine or pyridoxal). PLP is a versatile catalyst, acting as a coenzyme in a multitude of reactions, including decarboxylation, deamination and transamination. PLP-dependent enzymes are primarily involved in the biosynthesis of amino acids and amino acid-derived metabolites, but they are also found in the biosynthetic pathways of amino sugars and in the synthesis or catabolism of neurotransmitters; pyridoxal phosphate can also inhibit DNA polymerases and several steroid receptors. Inadequate levels of pyridoxal phosphate in the brain can cause neurological dysfunction, particularly epilepsy.

    PLP enzymes exist in their resting state as a Schiff base, the aldehyde group of PLP forming a linkage with the epsilon-amino group of an active site lysine residue on the enzyme. The alpha-amino group of the substrate displaces the lysine epsilon-amino group, in the process forming a new aldimine with the substrate. This aldimine is the common central intermediate for all PLP-catalysed reactions, enzymatic and non-enzymatic.

    This entry represents the major region of PLP-dependent transferases. This domain has a three layer alpha/beta/alpha sandwich topology, with mixed beta-sheets of 7 strands. The major region can be found in the following PLP-dependent transferase families:

    Proteins where this domain is known:
    MAL7P1.150    PF07_0068    PF14_0155    PF14_0534    PFB0200c    PFD0285c    PFF0435w    PFL0255c    PFL1720w    PFL2210w   


    SSF53448 - SSF53448 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.144    MAL13P1.218    PF11_0427    PF14_0774    PFA0340w    PFE0875c    PFL0675c   


    SSF53474 - SSF53474 (Superfamily link)

    Proteins where this domain is known:
    MAL7P1.156    MAL7P1.178    MAL8P1.138    MAL8P1.38    MAL8P1.91    PF07_0005    PF07_0040    PF08_0022    PF10_0018    PF10_0020    PF10_0379    PF11_0168    PF11_0211    PF11_0276    PF11_0356    PF11_0441    PF13_0032    PF13_0153    PF14_0015    PF14_0017    PF14_0099    PF14_0250    PF14_0395    PF14_0556    PF14_0737    PF14_0738    PFA0120c    PFC0065c    PFC0950c    PFD0185c    PFF1420w    PFF1460c    PFI1775w    PFI1800w    PFL0295c    PFL2530w   


    SSF53597 - Dihydrofolate reductases (Superfamily link)

    Proteins where this domain is known:
    PFD0830w   


    SSF53613 - SSF53613 (Superfamily link)

    Proteins where this domain is known:
    PF11_0453    PFE1030c    PFF0775w    PFL1920c   
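
    Each record in this listing follows the same layout: a header line of the form "ID - name (... link)", optional Interpro lines, and a "Proteins where this domain is known:" line followed by whitespace-separated gene identifiers. The sketch below shows one way such records might be read into a dictionary. The function name parse_entries and the regular expression are illustrative assumptions, not part of the original resource, and the pattern assumes every header ends in "(<word> link)".

```python
import re

# Assumed header shape: "<ID> - <name> (<source> link)", e.g.
# "SSF53597 - Dihydrofolate reductases (Superfamily link)".
HEADER = re.compile(r"^\s*(\S+) - (.+?) \((\w+) link\)\s*$")
PROTEIN_MARKER = "Proteins where this domain is known:"


def parse_entries(text):
    """Map each domain ID to its name and list of protein identifiers."""
    entries = {}
    current = None       # ID of the record being read
    expect_ids = False   # True right after the protein marker line
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            entries[current] = {"name": match.group(2), "proteins": []}
            expect_ids = False
        elif line.strip() == PROTEIN_MARKER:
            expect_ids = current is not None
        elif expect_ids and line.strip():
            # Identifiers are separated by runs of spaces on one line.
            entries[current]["proteins"].extend(line.split())
            expect_ids = False
    return entries
```

    Applied to the two records immediately above, this would give one identifier for SSF53597 and four for SSF53613.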


    SSF53623 - Mur_ligase_cen (Superfamily link)

    Interpro entry IPR013221 : Mur ligase, central (Interpro link)

    Interpro description:

    The bacterial cell wall provides strength and rigidity to counteract internal osmotic pressure, and protection against the environment. The peptidoglycan layer gives the cell wall its strength, and helps maintain the overall shape of the cell. The basic peptidoglycan structure of both Gram-positive and Gram-negative bacteria consists of a sheet of glycan chains connected by short cross-linking polypeptides. Biosynthesis of peptidoglycan is a multi-step (11-12 steps) process comprising three main stages:

    Stage two involves four key Mur ligase enzymes: MurC, MurD, MurE and MurF. These four Mur ligases are responsible for the successive additions of L-alanine, D-glutamate, meso-diaminopimelate or L-lysine, and D-alanyl-D-alanine to UDP-N-acetylmuramic acid. All four Mur ligases are topologically similar to one another, even though they display low sequence identity. They are each composed of three domains: an N-terminal Rossmann-fold domain responsible for binding the UDPMurNAc substrate; a central domain (similar to ATP-binding domains of several ATPases and GTPases); and a C-terminal domain (similar to dihydrofolate reductase fold) that appears to be associated with binding the incoming amino acid. The conserved sequence motifs found in the four Mur enzymes also map to other members of the Mur ligase family, including folylpolyglutamate synthetase, cyanophycin synthetase and the capB enzyme from Bacillales.

    This entry represents the central domain from all four stage 2 Mur enzymes: UDP-N-acetylmuramate-L-alanine ligase (MurC), UDP-N-acetylmuramoylalanine-D-glutamate ligase (MurD), UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (MurE), and UDP-N-acetylmuramoyl-tripeptide-D-alanyl-D-alanine ligase (MurF). This entry also includes folylpolyglutamate synthase that transfers glutamate to folylpolyglutamate and cyanophycin synthetase that catalyses the biosynthesis of the cyanobacterial reserve material multi-L-arginyl-poly-L-aspartate (cyanophycin).

    Proteins where this domain is known:
    PF13_0140   


    SSF53649 - Alkaline_phosphatase_core (Superfamily link)

    Interpro entry IPR017850 : Alkaline-phosphatase-like, core domain (Interpro link)

    Interpro description:

    This entry represents a structural domain with a core 3-layer alpha/beta/alpha structure, which can sometimes contain additional subdomains (also covered by this entry). These domains form the core domain of alkaline phosphatases. This structural domain is found in:

    Proteins where this domain is known:
    PFL0685w   


    SSF53659 - SSF53659 (Superfamily link)

    Proteins where this domain is known:
    PF13_0242   


    SSF53671 - Asp/Orn_carbamoyltranf (Superfamily link)

    Interpro entry IPR006130 : Aspartate/ornithine carbamoyltransferase (Interpro link)

    Interpro description:

    This family contains two related enzymes:

    1. Aspartate carbamoyltransferase (ATCase) catalyses the conversion of aspartate and carbamoyl phosphate to carbamoylaspartate, the second step in the de novo biosynthesis of pyrimidine nucleotides. In prokaryotes ATCase consists of two subunits: a catalytic chain (gene pyrB) and a regulatory chain (gene pyrI), while in eukaryotes it is a domain in a multi-functional enzyme (called URA2 in yeast, rudimentary in Drosophila, and CAD in mammals) that also catalyses other steps of the biosynthesis of pyrimidines.
    2. Ornithine carbamoyltransferase (OTCase) catalyses the conversion of ornithine and carbamoyl phosphate to citrulline. In mammals this enzyme participates in the urea cycle and is located in the mitochondrial matrix. In prokaryotes and eukaryotic microorganisms it is involved in the biosynthesis of arginine. In some bacterial species it is also involved in the degradation of arginine (the arginine deaminase pathway).

    It has been shown that these two enzymes are evolutionarily related. The predicted secondary structures of both enzymes are similar and there are some regions of sequence similarity. One of these regions includes three residues which have been shown, by crystallographic studies, to be implicated in binding the phosphoryl group of carbamoyl phosphate.

    Proteins where this domain is known:
    MAL13P1.221   


    SSF53697 - SSF53697 (Superfamily link)

    Proteins where this domain is known:
    PF10_0245    PF14_0341   


    SSF53732 - Aconitase_N (Superfamily link)

    Interpro entry IPR001030 : Aconitase/3-isopropylmalate dehydratase large subunit, alpha/beta/alpha (Interpro link)

    Interpro description:

    3-isopropylmalate dehydratase (or isopropylmalate isomerase) catalyses the stereo-specific isomerisation of 2-isopropylmalate and 3-isopropylmalate, via the formation of 2-isopropylmaleate. This enzyme performs the second step in the biosynthesis of leucine, and is present in most prokaryotes and many fungal species. The prokaryotic enzyme is a heterodimer composed of a large (LeuC) and small (LeuD) subunit, while the fungal form is a monomeric enzyme. Both forms of isopropylmalate dehydratase are related and are part of the larger aconitase family. Aconitases are mostly monomeric proteins which share four domains in common and contain a single, labile [4Fe-4S] cluster. Three structural domains (1, 2 and 3) are tightly packed around the iron-sulphur cluster, while a fourth domain (4) forms a deep active-site cleft. The prokaryotic enzyme is encoded by two adjacent genes, leuC and leuD, corresponding to aconitase domains 1-3 and 4 respectively. LeuD does not bind an iron-sulphur cluster. It is thought that some prokaryotic isopropylmalate dehydratases can also function as homoaconitase, converting cis-homoaconitate to homoisocitric acid in lysine biosynthesis. Homoaconitase has been identified in higher fungi (mitochondria), several archaea and one thermophilic species of bacteria, Thermus thermophilus.

    Aconitase (aconitate hydratase) is an iron-sulphur protein that contains a [4Fe-4S]-cluster and catalyses the interconversion of isocitrate and citrate via a cis-aconitate intermediate. Aconitase functions in both the TCA and glyoxylate cycles; however, unlike the majority of iron-sulphur proteins that function as electron carriers, the [4Fe-4S]-cluster of aconitase reacts directly with an enzyme substrate. In eukaryotes there is a cytosolic form (cAcn) and a mitochondrial form (mAcn) of the enzyme. In bacteria there are also 2 forms, aconitase A (AcnA) and B (AcnB). Several aconitases are known to be multi-functional enzymes with a second non-catalytic, but essential function that arises when the cellular environment changes, such as when iron levels drop. Eukaryotic cAcn and mAcn, and bacterial AcnA have the same domain organisation, consisting of three N-terminal alpha/beta/alpha domains, a linker region, followed by a C-terminal 'swivel' domain with a beta/beta/alpha structure (1-2-3-linker-4), although mAcn is smaller than cAcn. However, bacterial AcnB has a different organisation: it contains an N-terminal HEAT-like domain, followed by the 'swivel' domain, then the three alpha/beta/alpha domains (HEAT-4-1-2-3). Below is a description of some of the multi-functional activities associated with different aconitases.

    This entry represents a region containing 3 domains, each with a 3-layer alpha/beta/alpha topology. This region represents the [4Fe-4S] cluster-binding region found at the N-terminus of eukaryotic mAcn, cAcn/IRP1 and IRP2, and bacterial AcnA, but at the C-terminus of bacterial AcnB. This domain is also found in the large subunit of isopropylmalate dehydratase (LeuC).
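
    The two domain organisations described above (1-2-3-linker-4 and HEAT-4-1-2-3) can be written out as ordered lists to make the position of the cluster-binding region explicit. This is only an illustrative sketch: the dictionary keys, the constant names and the function cluster_region_position are hypothetical and do not come from Interpro.

```python
# Domain orders, N-terminus to C-terminus, as given in the description.
DOMAIN_ORDER = {
    "cAcn/mAcn/AcnA": ["1", "2", "3", "linker", "4"],
    "AcnB": ["HEAT", "4", "1", "2", "3"],
}

# The three alpha/beta/alpha domains covered by this entry.
CLUSTER_REGION = ["1", "2", "3"]


def cluster_region_position(order):
    """Report where the contiguous 1-2-3 block sits within a domain order."""
    width = len(CLUSTER_REGION)
    for start in range(len(order) - width + 1):
        if order[start:start + width] == CLUSTER_REGION:
            if start == 0:
                return "N-terminal"
            if start + width == len(order):
                return "C-terminal"
            return "internal"
    return None  # block not present in this organisation


for name, order in DOMAIN_ORDER.items():
    print(name, "-".join(order), cluster_region_position(order))
```

    For the two organisations listed, the block is reported as N-terminal for cAcn/mAcn/AcnA and C-terminal for AcnB, matching the text.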

    More information about these proteins can be found at Protein of the Month: Aconitase.

    Proteins where this domain is known:
    PF13_0229   


    SSF53738 - A-D-PHexomutase_a/b/a-I/II/III (Superfamily link)

    Interpro entry IPR016055 : Alpha-D-phosphohexomutase, alpha/beta/alpha I, II and III (Interpro link)

    Interpro description:

    The alpha-D-phosphohexomutase superfamily is composed of four related enzymes, each of which catalyses a phosphoryl transfer on its sugar substrate: phosphoglucomutase (PGM), phosphoglucomutase/phosphomannomutase (PGM/PMM), phosphoglucosamine mutase (PNGM), and phosphoacetylglucosamine mutase (PAGM). PGM converts D-glucose 1-phosphate into D-glucose 6-phosphate, and participates in both the breakdown and synthesis of glucose. PGM/PMM enzymes are primarily bacterial and use either glucose or mannose as substrate, participating in the biosynthesis of a variety of carbohydrates such as lipopolysaccharides and alginate. Both PNGM and PAGM are involved in the biosynthesis of UDP-N-acetylglucosamine.

    Despite differences in substrate specificity, these enzymes share a similar catalytic mechanism, converting 1-phospho-sugars to 6-phospho-sugars via a bisphosphorylated 1,6-phospho-sugar. The active enzyme is phosphorylated at a conserved serine residue and binds one magnesium ion; residues around the active site serine are well conserved among family members. The reaction mechanism involves phosphoryl transfer from the phosphoserine to the substrate to create a bisphosphorylated sugar, followed by a phosphoryl transfer from the substrate back to the enzyme.

    The structures of PGM and PGM/PMM have been determined, and were found to be very similar in topology. These enzymes are both composed of four domains and a large central active site cleft, where each domain contains residues essential for catalysis and/or substrate recognition. Domain I contains the catalytic phosphoserine, domain II contains a metal-binding loop to coordinate the magnesium ion, domain III contains the sugar-binding loop that recognises the two different binding orientations of the 1- and 6-phospho-sugars, and domain IV contains a phosphate-binding site required for orienting the incoming phospho-sugar substrate.

    This entry represents domains I, II and III found in alpha-D-phosphohexomutase enzymes. All three domains share a 3-layer alpha/beta/alpha topology.

    Proteins where this domain is known:
    PF10_0122    PF11_0311   


    SSF53748 - PGK (Superfamily link)

    Interpro entry IPR001576 : Phosphoglycerate kinase (Interpro link)

    Interpro description:

    Phosphoglycerate kinase (PGK) is an enzyme that catalyses the formation of ATP from ADP and vice versa. In the second step of the second phase in glycolysis, 1,3-diphosphoglycerate is converted to 3-phosphoglycerate, forming one molecule of ATP. If the reverse were to occur, one molecule of ADP would be formed. This reaction is essential in most cells for the generation of ATP in aerobes, for fermentation in anaerobes and for carbon fixation in plants.

    PGK is found in all living organisms and its sequence has been highly conserved throughout evolution. The enzyme exists as a monomer containing two nearly equal-sized domains that correspond to the N- and C-termini of the protein (the last 15 C-terminal residues loop back into the N-terminal domain). 3-phosphoglycerate (3-PG) binds to the N-terminal, while the nucleotide substrates, MgATP or MgADP, bind to the C-terminal domain of the enzyme. This extended two-domain structure is associated with large-scale 'hinge-bending' conformational changes, similar to those found in hexokinase. At the core of each domain is a 6-stranded parallel beta-sheet surrounded by alpha helices. Domain 1 has a parallel beta-sheet of six strands with an order of 342156, while domain 2 has a parallel beta-sheet of six strands with an order of 321456. Analysis of the reversible unfolding of yeast phosphoglycerate kinase leads to the conclusion that the two lobes are capable of folding independently, consistent with the presence of intermediates on the folding pathway with a single domain folded.

    Phosphoglycerate kinase (PGK) deficiency is associated with haemolytic anaemia and mental disorders in man.

    This entry represents the full PGK enzyme.

    Proteins where this domain is known:
    PFI1105w   


    SSF53756 - SSF53756 (Superfamily link)

    Proteins where this domain is known:
    PF10_0316   


    SSF53784 - Ppfruckinase (Superfamily link)

    Interpro entry IPR000023 : Phosphofructokinase (Interpro link)

    Interpro description:
    The enzyme-catalysed transfer of a phosphoryl group from ATP is an important reaction in a wide variety of biological processes. One enzyme that utilises this reaction is phosphofructokinase (PFK), which catalyses the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate, a key regulatory step in the glycolytic pathway. PFK exists as a homotetramer in bacteria and mammals (where each monomer possesses 2 similar domains), and as an octamer in yeast (where there are 4 alpha- (PFK1) and 4 beta-chains (PFK2), the latter, like the mammalian monomers, possessing 2 similar domains).

    PFK is ~300 amino acids in length, and structural studies of the bacterial enzyme have shown it comprises two similar (alpha/beta) lobes: one involved in ATP binding and the other housing both the substrate-binding site and the allosteric site (a regulatory binding site distinct from the active site, but that affects enzyme activity). The identical tetramer subunits adopt 2 different conformations: in a 'closed' state, the bound magnesium ion bridges the phosphoryl groups of the enzyme products (ADP and fructose-1,6-bisphosphate); and in an 'open' state, the magnesium ion binds only the ADP, as the 2 products are now further apart. These conformations are thought to be successive stages of a reaction pathway that requires subunit closure to bring the 2 molecules sufficiently close to react.

    Deficiency in PFK leads to glycogenosis type VII (Tarui's disease), an autosomal recessive disorder characterised by severe nausea, vomiting, muscle cramps and myoglobinuria in response to bursts of intense or vigorous exercise. Sufferers are usually able to lead a reasonably ordinary life by learning to adjust activity levels.

    Proteins where this domain is known:
    PF11_0294    PFI0755c   


    SSF53790 - Cor/por_Metransf (Superfamily link)

    Interpro entry IPR000878 : Tetrapyrrole methylase (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including cobalamin (vitamin B12), haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    This entry represents several tetrapyrrole methylases, which consist of two non-similar domains. These enzymes catalyse the methylation of their substrates using S-adenosyl-L-methionine as a methyl source. Enzymes in this family include:

    Proteins where this domain is known:
    PF10_0087   


    SSF53795 - PEP carboxykinase-like (Superfamily link)

    Proteins where this domain is known:
    PF13_0234   


    SSF53800 - SSF53800 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.326   


    SSF53850 - Periplasmic binding protein-like II (Superfamily link)

    Proteins where this domain is known:
    PF07_0049    PFL0480w   


    SSF53901 - Thiolase-like (Superfamily link)

    Interpro entry IPR016039 : Thiolase-like (Interpro link)

    Interpro description:

    This entry represents a structural domain with a thiolase-like 3-layer alpha/beta/alpha topology. This domain usually occurs in two similar copies that are related by a pseudo-dyad, and which arose through duplication. The proteins in this entry can be split into two groups: those related to thiolase, and those related to chalcone synthase. The thiolase-like enzymes include:

    The chalcone synthase-like enzymes include:

    Proteins where this domain is known:
    PF14_0484    PFB0505c    PFF1275c   


    SSF53920 - Fe_hydrog (Superfamily link)

    Interpro entry IPR009016 : (Interpro link)

    Interpro description:

    The iron-only hydrogenases catalyse the two-electron reduction of two protons to yield dihydrogen, as part of an energy cycle. Fe-only hydrogenases are restricted to strictly anaerobic microbes, and are often very sensitive to molecular oxygen. The cytoplasmic monomeric Fe hydrogenases are involved in hydrogen production, while the periplasmic, heterodimeric Fe hydrogenases are involved in hydrogen uptake. Fe hydrogenases consist of two intertwined domains: the catalytic domain and the larger subunit C domain. The larger subunit C domain can be divided into three subdomains. There are five distinct metal clusters, one in the catalytic domain. This entry represents the two intertwined domains, the catalytic domain and the large subunit C domain.

    Proteins where this domain is known:
    PFF0685c   


    SSF53927 - Cytidine_deaminase-like (Superfamily link)

    Interpro entry IPR016193 : Cytidine deaminase-like (Interpro link)

    Interpro description:

    This entry represents a structural domain with a core fold consisting of alpha-beta(2)-(alpha-beta)2 that folds into three layers, a/b/a; some members may contain an extra C-terminal strand. This domain is found in several types of proteins, including:

    Proteins where this domain is known:
    PF13_0259    PFL0230w   


    SSF54001 - SSF54001 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.310    MAL7P1.147    MAL8P1.157    PF10_0308    PF11_0161    PF11_0162    PF11_0165    PF11_0174    PF11_0177    PF11_0428    PF13_0096    PF14_0145    PF14_0553    PF14_0576    PF14_0687    PFA0220w    PFB0325c    PFB0330c    PFB0335c    PFB0340c    PFB0345c    PFB0350c    PFB0355c    PFB0360c    PFD0165w    PFD0230c    PFD0655c    PFE0835w    PFE1355c    PFI0135c    PFI0225w    PFI1135c    PFL1635w    PFL2290w   


    SSF54160 - Chromodomain-like (Superfamily link)

    Interpro entry IPR016197 : Chromo domain-like (Interpro link)

    Interpro description:

    This entry represents a chromo (CHRomatin Organization MOdifier) domain-like structural domain, which consists of an SH3-like beta-barrel capped by a C-terminal helix. Chromo domains are conserved modules of around 60 amino acids that are implicated in the recognition of lysine-methylated histone tails and nucleic acids. Chromo domains were originally identified in Drosophila modifiers of variegation, proteins that alter the structure of chromatin to the condensed morphology of heterochromatin. Domains with a chromo domain structural fold include:

    Chromo domains can be found in various nuclear proteins, including heterochromatin protein 1 (HP1) (N-terminal chromo domain and C-terminal chromo shadow domain), where the chromo domain recognises histone tails with specifically methylated lysines; polycomb protein Pc, which is essential for maintaining the silencing state of homeotic genes during development (chromo domain important for chromatin targeting); histone methyltransferase clr4, which regulates silencing and switching at the mating-type loci and affects chromatin structure at centromeres; and the ATP-dependent helicase CHD1, which regulates ATP-dependent nucleosome assembly and mobilisation through conserved double chromo domains and a SWI2/SNF2 helicase/ATPase domain.

    Chromo barrel domains are found in various histone acetyltransferases, such as MYST1 from Mus musculus (Mouse) and MOF from Drosophila melanogaster (Fruit fly). This domain can also be found in the human mortality factor 4-like protein, MRG15.

    Proteins where this domain is known:
    PF10_0232    PF11_0418    PFL1005c   


    SSF54189 - L23_L15e_core (Superfamily link)

    Interpro entry IPR012678 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Both the L23 and L15e ribosomal proteins have a core domain consisting of a beta-(alpha)-beta-alpha-beta(2) structure folded into three layers, alpha/beta/alpha, where the beta-sheets are antiparallel.

    Proteins where this domain is known:
    PF13_0132    PFD0770c    PFL1895w   


    SSF54197 - His_triad-like_motif (Superfamily link)

    Interpro entry IPR011146 : Histidine triad-like motif (Interpro link)

    Interpro description:

    The histidine triad motif (HIT) is related to the sequence H-phi-H-phi-H-phi-phi (where phi is a hydrophobic amino acid). Proteins containing HIT domains form a superfamily of nucleotide hydrolases and transferases that act on the alpha-phosphate of ribonucleotides. HIT-containing proteins fall into three families:

    Proteins where this domain is known:
    PF08_0059    PF14_0349   
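
    The histidine triad pattern H-phi-H-phi-H-phi-phi given in the description above can be expressed as a regular expression over one-letter amino acid codes. The sketch below is a minimal illustration under a stated assumption: the set of residues treated as hydrophobic (A, V, I, L, M, F, W, Y, C) is a choice made here and is not specified by the description. The names HYDROPHOBIC, HIT_PATTERN and find_hit_motifs are likewise hypothetical.

```python
import re

# Assumption: residues counted as hydrophobic ("phi"); adjust as needed.
HYDROPHOBIC = "AVILMFWYC"

# H-phi-H-phi-H-phi-phi written as a character-class pattern.
HIT_PATTERN = re.compile(
    rf"H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]H[{HYDROPHOBIC}]{{2}}"
)


def find_hit_motifs(sequence):
    """Return (0-based start, matched text) for non-overlapping HIT motifs."""
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in HIT_PATTERN.finditer(seq)]


# Toy sequence (not a real protein) containing one motif at position 2.
print(find_hit_motifs("MKHVHLHVLAA"))
```

    A match only indicates that the seven-residue pattern is present; it says nothing about whether the surrounding sequence folds as a HIT domain.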


    SSF54211 - SSF54211 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.204    MAL13P1.243    MAL7P1.145    MAL7P1.66    PF07_0029    PF08_0076    PF10_0041    PF11_0184    PF11_0188    PF11_0382    PF13_0340    PF14_0132    PF14_0147    PF14_0256    PF14_0316    PF14_0417    PF14_0448    PF14_0486    PFB0415c    PFE0150c    PFF0115c    PFL1070c    PFL1590c    PFL1915w   


    SSF54236 - SSF54236 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.64    MAL8P1.122    MAL8P1.62    PF08_0067    PF10_0114    PF10_0193    PF11_0142    PF11_0329    PF13_0020    PF13_0084    PF13_0346    PF14_0027    PF14_0090    PFE0285c    PFE0380c    PFE1355c    PFI0335w    PFI1085w    PFI1680w    PFL0585w    PFL1830w   


    SSF54292 - Ferredoxin (Superfamily link)

    Interpro entry IPR001041 : Ferredoxin (Interpro link)

    Interpro description:

    Members of the ferredoxin protein family are electron carrier proteins with an iron-sulphur cofactor that act in a wide variety of metabolic reactions. Ferredoxins can be divided into several subgroups depending upon the physiological nature of the iron-sulphur cluster(s) and according to sequence similarities.

    This entry represents members of the 2Fe-2S ferredoxin family that have a general core structure consisting of beta(2)-alpha-beta(2), which includes putidaredoxin and terpredoxin, and adrenodoxin. They are proteins of around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. This conserved region is also found as a domain in various metabolic enzymes and in multidomain proteins, such as aldehyde oxidoreductase (N-terminal), xanthine oxidase (N-terminal), phthalate dioxygenase reductase (C-terminal), succinate dehydrogenase iron-sulphur protein (N-terminal), and methane monooxygenase reductase (N-terminal).

    Proteins where this domain is known:
    MAL13P1.95    PFL0630w    PFL0705c   


    SSF54373 - SSF54373 (Superfamily link)

    Proteins where this domain is known:
    PF10_0275    PFC0275w    PFL0575w    PFL2060c   


    SSF54427 - SSF54427 (Superfamily link)

    Proteins where this domain is known:
    PF14_0122    PF14_0228   


    SSF54447 - ssDNA_bind_regul (Superfamily link)

    Interpro entry IPR009044 : ssDNA-binding transcriptional regulator (Interpro link)

    Interpro description:

    This entry represents a ssDNA-binding transcriptional regulator domain consisting of a helix-swapped dimer of beta(4)-alpha motifs. This domain is found as a C-terminal domain in the transcriptional co-activator PC4 (where it is a dimer of two separate motifs), and in the plant transcriptional regulator PBF-2 (where it is a single chain domain formed by a tandem repeat of two motifs).

    Transcriptional regulators play a critical role in controlling the level of transcription from specific genes in response to different stimuli. Members of this family of transcriptional regulators, which preferentially bind single-stranded DNA, include PBF-2 from plants, the mammalian nuclear factor 1-X (NF1-X), and positive cofactor 4 (PC4). These proteins are structurally similar, consisting of a helix-swapped dimer of beta(4)-alpha motifs.

    The plant defence transcription factor PBF-2 is composed of four p24 subunits that interact through a helix-loop-helix motif to produce a central pore. PBF-2 functions as part of the plant's defence system in response to the detection of a pathogen. Upon stimulation, PBF-2 induces several signal transduction pathways leading to changes in the expression of defence genes, including the pathogenesis-related (PR) genes.

    NF1-X is one of several NF1 proteins that function as transcription factors. NF1-X consists of two functionally distinct domains: a conserved N-terminal DNA-binding domain and a C-terminal transcriptional regulatory domain. NF1-X binds to the promoter for the 3-hydroxy-3-methylglutaryl-CoA (HMG-CoA) reductase gene.

    PC4 (or P15) possesses the ability to co-activate and suppress transcription via its DNA-binding activity. PC4 has been shown to stimulate transcription in vitro with diverse activators, including VP16, thyroid hormone receptor and BRCA-1, often involving TFIIA. PC4 and TFIIA are thought to facilitate the assembly of the pre-initiation complex. The repressive activity of PC4 can be alleviated by the transcription factor TFIIH, which protects promoters from PC4 repression. PC4 consists of two domains: an N-terminal regulatory domain and a C-terminal cryptic DNA-binding domain. The protein acts as a dimer with two ssDNA binding channels running in opposite directions to each other.

    Proteins where this domain is known:
    PFE1025c   


    SSF54495 - UBQ-conjugat/RWD-like (Superfamily link)

    Interpro entry IPR016135 : (Interpro link)

    Interpro description:

    This entry represents a structural domain with an alpha-beta(4)-alpha(3) core fold. Domains of this structure are found in:

    Proteins where this domain is known:
    MAL13P1.227    MAL8P1.41    PF08_0085    PF10_0330    PF13_0297    PF13_0301    PF14_0128    PF14_0264    PFC0255c    PFC0855w    PFE1350c    PFF0305c    PFI0740c    PFI1030c    PFL0190w    PFL2100w    PFL2175w   


    SSF54518 - Tubby_C (Superfamily link)

    Interpro entry IPR000007 : (Interpro link)

    Interpro description:

    Tubby, an autosomal recessive mutation mapping to mouse chromosome 7, was recently found to be the result of a splicing defect in a novel gene with unknown function. This mutation maps to the tub gene. The mouse tubby mutation is the cause of maturity-onset obesity, insulin resistance and sensory deficits. By contrast with the rapid juvenile-onset weight gain seen in diabetes (db) and obese (ob) mice, obesity in tubby mice develops gradually, and strongly resembles the late-onset obesity observed in the human population. Excessive deposition of adipose tissue culminates in a two-fold increase of body weight. Tubby mice also suffer retinal degeneration and neurosensory hearing loss. The tripartite character of the tubby phenotype is highly similar to human obesity syndromes, such as Alstrom and Bardet-Biedl. Although these phenotypes indicate a vital role for tubby proteins, no biochemical function has yet been ascribed to any family member, although it has been suggested that the phenotypic features of tubby mice may be the result of cellular apoptosis triggered by expression of the mutated tub gene. TUB is the founding member of the tubby-like proteins, the TULPs. TULPs are found in multicellular organisms from both the plant and animal kingdoms. Ablation of members of this protein family causes disease phenotypes that are indicative of their importance in nervous-system function and development.

    Mammalian TUB is a hydrophilic protein of ~500 residues. The N-terminal portion of the protein is conserved neither in length nor sequence, but, in TUB, contains the nuclear localisation signal and may have transcriptional-activation activity. The C-terminal 250 residues are highly conserved. The C-terminal extremity contains a cysteine residue that might play an important role in the normal functioning of these proteins. The crystal structure of the C-terminal core domain from mouse tubby has been determined to 1.9 Å resolution. This domain is arranged as a 12-stranded, all anti-parallel, closed beta-barrel that surrounds a central alpha helix (which is at the extreme carboxyl terminus of the protein) that forms most of the hydrophobic core. Structural analyses suggest that TULPs constitute a unique family of bipartite transcription factors.

    Proteins where this domain is known:
    PF14_0058   


    SSF54529 - MAM33 (Superfamily link)

    Interpro entry IPR003428 : Mitochondrial glycoprotein (Interpro link)

    Interpro description:
    This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular 'heads' of C1Q.

    Proteins where this domain is known:
    PF14_0329   


    SSF54534 - SSF54534 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.340    MAL13P1.68    PFL2275c   
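
    The entries in this listing follow a regular layout: a header line, an optional Interpro entry line, and a "Proteins where this domain is known:" line followed by the gene identifiers. The sketch below shows one way such records might be read into a structure. The function name parse_entries, the regular expressions and the dictionary keys are illustrative assumptions inferred from the layout seen here, not part of the listing or of any Interpro tool.

```python
import re

# Assumed layout, inferred from the entries in this listing:
#   "<ID> - <name> (<source> link)"
#   "Interpro entry IPRnnnnnn : <title> (Interpro link)"   (optional)
#   "Proteins where this domain is known:" then one line of gene IDs
HEADER = re.compile(r"^\s*(\S+) - (.+?) \((\w+) link\)\s*$")
INTERPRO = re.compile(r"^\s*Interpro entry (IPR\d{6}) :")
PROTEINS = "Proteins where this domain is known:"


def parse_entries(text):
    """Return a list of dicts with keys id, name, source, interpro, proteins."""
    entries = []
    current = None
    expect_proteins = False
    for line in text.splitlines():
        if expect_proteins:
            # Gene identifiers are separated by runs of whitespace.
            current["proteins"] = line.split()
            expect_proteins = False
            continue
        header = HEADER.match(line)
        if header:
            current = {
                "id": header.group(1),
                "name": header.group(2),
                "source": header.group(3),
                "interpro": None,
                "proteins": [],
            }
            entries.append(current)
            continue
        if current is None:
            continue  # text before the first header is ignored
        ipr = INTERPRO.match(line)
        if ipr:
            current["interpro"] = ipr.group(1)
        elif line.strip() == PROTEINS:
            expect_proteins = True
    return entries


sample = """\
    SSF54529 - MAM33 (Superfamily link)

    Interpro entry IPR003428 : Mitochondrial glycoprotein (Interpro link)

    Proteins where this domain is known:
    PF14_0329


    SSF54534 - SSF54534 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.340    MAL13P1.68    PFL2275c
"""

entries = parse_entries(sample)
for entry in entries:
    print(entry["id"], entry["interpro"], entry["proteins"])
```

    Entries without an Interpro line, such as SSF54534, keep interpro set to None, so they can be told apart from annotated entries.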


    SSF54565 - Ribosomal_S16 (Superfamily link)

    Interpro entry IPR000307 : Ribosomal protein S16 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S16 is one of the proteins from the small ribosomal subunit. It belongs to a family of ribosomal proteins that are grouped on the basis of sequence similarities.

    S16 proteins have about 100 amino-acid residues.

    Proteins where this domain is known:
    PFE1560c   


    SSF54570 - Ribosomal_S19 (Superfamily link)

    Interpro entry IPR002222 : Ribosomal protein S19/S15 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    The small subunit ribosomal proteins can be categorised as: primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins. The small ribosomal subunit protein S19 contains 88-144 amino acid residues. In Escherichia coli, S19 is known to form a complex with S13 that binds strongly to 16S ribosomal RNA. Experimental evidence has revealed that S19 is moderately exposed on the ribosomal surface, and is designated a secondary rRNA binding protein. S19 belongs to a family of ribosomal proteins that includes: eubacterial S19; algal and plant chloroplast S19; cyanelle S19; archaebacterial S19; plant mitochondrial S19; and eukaryotic S15 ('rig' protein).

    Proteins where this domain is known:
    MAL13P1.92   


    SSF54575 - Ribosomal_L31e (Superfamily link)

    Interpro entry IPR000054 : Ribosomal protein L31e (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    A number of eukaryotic and archaebacterial large subunit ribosomal proteins can be grouped on the basis of sequence similarities. These proteins have 87 to 128 amino-acid residues. This family consists of:

  • Yeast L34
  • Archaeal L31
  • Plant L31
  • Mammalian L31

    Proteins where this domain is known:
    PFE0185c   


    SSF54585 - Cdc48 domain 2-like (Superfamily link)

    Proteins where this domain is known:
    PF07_0047    PF10_0072    PFC0140c    PFF0940c   


    SSF54593 - SSF54593 (Superfamily link)

    Proteins where this domain is known:
    PF11_0145    PFF0230c   


    SSF54631 - SSF54631 (Superfamily link)

    Proteins where this domain is known:
    PFI1020c   


    SSF54637 - Thioesterase/thiol ester dehydrase-isomerase (Superfamily link)

    Proteins where this domain is known:
    PF11_0364    PF13_0128    PFB0835c   


    SSF54648 - Dynein_light_chain_typ-1/2 (Superfamily link)

    Interpro entry IPR001372 : Dynein light chain, type 1 and 2 (Interpro link)

    Interpro description:

    Dynein is a multisubunit microtubule-dependent motor enzyme that acts as the force generating protein of eukaryotic cilia and flagella. The cytoplasmic isoform of dynein acts as a motor for the intracellular retrograde motility of vesicles and organelles along microtubules.

    Dynein is composed of a number of ATP-binding large subunits, intermediate size subunits and small subunits. Among the small subunits is a group of highly conserved proteins that make up this family.

    Both type 1 (DLC1) and 2 (DLC2) dynein light chains have a similar two-layer alpha-beta core structure consisting of beta-alpha(2)-beta-X-beta(2).

    Proteins where this domain is known:
    MAL7P1.161    PF13_0306    PFL0660w   


    SSF54675 - SSF54675 (Superfamily link)

    Proteins where this domain is known:
    PFF1410c   


    SSF54686 - Ribosomal_L10e/L16 (Superfamily link)

    Interpro entry IPR016180 : Ribosomal protein L10e/L16 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents a structural domain with an alpha/beta-hammerhead fold, where the beta-hammerhead motif is similar to that in barrel-sandwich hybrids. Domains of this structure can be found in ribosomal proteins L10e and L16.

    Proteins where this domain is known:
    PF14_0041    PF14_0141   


    SSF54695 - BTB/POZ_fold (Superfamily link)

    Interpro entry IPR011333 : BTB/POZ fold (Interpro link)

    Interpro description:

    The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is a versatile protein-protein interaction motif involved in many cellular functions, including transcriptional regulation, cytoskeleton dynamics, ion channel assembly and gating, and targeting proteins for ubiquitination. The BTB domain can occur alongside other domains: BTB-zinc finger (BTB-ZF), BTB-BACK-Kelch (BBK), voltage-gated potassium channel T1 (T1-Kv), MATH-BTB, BTB-NPH3 and BTB-BACK-PHR (BBP). Other proteins, such as Skp1 and ElonginC, consist almost exclusively of the core BTB fold. In all of these protein families, the BTB core fold is structurally conserved, consisting of a 2-layer alpha/beta topology where a cluster of alpha helices is flanked by short beta-sheets. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN.

    This entry differs from IPR000210 in including POZ-containing Skp1 proteins.

    Proteins where this domain is known:
    MAL13P1.337    PF13_0238    PFL1875w   


    SSF54713 - EF_TS (Superfamily link)

    Interpro entry IPR014039 : Translation elongation factor EFTs/EF1B, dimerisation (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    This entry represents the C-terminal dimerisation domain found primarily in EF-Ts (EF1B) proteins from bacteria, mitochondria and chloroplasts.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PFC0225c   


    SSF54719 - SODismutase (Superfamily link)

    Interpro entry IPR001189 : Manganese and iron superoxide dismutase (Interpro link)

    Interpro description:

    Superoxide dismutases (SODs) catalyse the conversion of superoxide radicals to molecular oxygen and hydrogen peroxide. Their function is to destroy the radicals that are normally produced within cells and are toxic to biological systems. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. This family includes both single metal-binding SODs and cambialistic SOD, which can bind either Mn or Fe. Fe/MnSODs are ubiquitous enzymes that are responsible for the majority of SOD activity in prokaryotes, fungi, blue-green algae and mitochondria. Fe/MnSODs are found as homodimers or homotetramers.

    The structure of Fe/MnSODs can be divided into two domains, an alpha N-terminal domain and an alpha/beta C-terminal domain, connected by a loop. The structure of the N-terminal domain consists of two helices in an antiparallel hairpin, with a left-handed twist. The structure of the C-terminal domain is of the alpha/beta type, and consists of a three-stranded antiparallel beta-sheet in the order 213, along with four helices in the arrangement alpha/beta(2)/alpha/beta/alpha(2).
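
    The arrangement string for the C-terminal domain can be checked against the stated counts of three strands and four helices. The sketch below assumes that a suffix such as "(2)" means two consecutive elements of the same type; that reading is an interpretation of the notation, and the function name expand_topology is hypothetical.

```python
import re
from collections import Counter


def expand_topology(notation):
    """Expand e.g. 'alpha/beta(2)' into ['alpha', 'beta', 'beta'].

    Assumes '/' separates consecutive elements and '(n)' repeats the
    preceding element type n times.
    """
    elements = []
    for token in notation.split("/"):
        match = re.fullmatch(r"(alpha|beta)(?:\((\d+)\))?", token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        count = int(match.group(2) or 1)
        elements.extend([match.group(1)] * count)
    return elements


# Arrangement quoted in the description of the C-terminal domain.
sod_c_terminal = expand_topology("alpha/beta(2)/alpha/beta/alpha(2)")
counts = Counter(sod_c_terminal)
print(sod_c_terminal)
print(counts["alpha"], "helices,", counts["beta"], "strands")
```

    Under that reading the string expands to four alpha and three beta elements, which agrees with the four helices and the three-stranded sheet described above.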

    Proteins where this domain is known:
    PF08_0071    PFF1130c   


    SSF54736 - Ribosomal_L7/12_C/ClpS-like (Superfamily link)

    Interpro entry IPR014719 : (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    This entry represents a domain found at the C-terminus of ribosomal proteins L7 and L12, and also in the adaptor protein ClpS, forming an alpha/beta sandwich.

    The L7 and L12 ribosomal proteins are part of the large 50S ribosomal subunit, and occur in four copies organised as two dimers. The L7/L12 dimer probably interacts with EF-Tu. L7 and L12 differ only in a single post-translational modification: the addition of an acetyl group to the N terminus of L7.

    ClpS is an adaptor protein that influences protein degradation through its binding to the N-terminal domain of the chaperone ClpA in the ClpAP chaperone-protease pair. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins.

    Proteins where this domain is known:
    MAL13P1.111    PFB0545c    PFE1225w   


    SSF54747 - Ribosomal_L11 (Superfamily link)

    Interpro entry IPR000911 : Ribosomal protein L11 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L11 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L11 is known to bind directly to the 23S rRNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, plant chloroplast, red algal chloroplast, cyanelle and archaebacterial L11; and mammalian, plant and yeast L12 (YL15). L11 is a protein of 140 to 165 amino-acid residues. In E. coli, the C-terminal half of L11 has been shown to be in an extended and loosely folded conformation and is likely to be buried within the ribosomal structure.

    Proteins where this domain is known:
    PF11_0113    PFE0850c   


    SSF54762 - SRP9/14 (Superfamily link)

    Interpro entry IPR009018 : Signal recognition particle, SRP9/SRP14 subunit (Interpro link)

    Interpro description:

    The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.

    This entry represents both the 9 kDa SRP9 and the 14 kDa SRP14 components. Both SRP9 and SRP14 have the same (beta)-alpha-beta(3)-alpha fold. The heterodimer has pseudo two-fold symmetry and is saddle-like, consisting of a curved six-stranded beta-sheet that has four helices packed on the convex side and an exposed concave surface lined with positively charged residues. The SRP9/SRP14 heterodimer is essential for SRP RNA binding, mediating the pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP.

    Proteins where this domain is known:
    MAL7P1.158    PFL0160w   


    SSF54768 - SSF54768 (Superfamily link)

    Proteins where this domain is known:
    MAL7P1.66    PF14_0448   


    SSF54782 - Porphobil_deam (Superfamily link)

    Interpro entry IPR000860 : Tetrapyrrole biosynthesis, hydroxymethylbilane synthase (Interpro link)

    Interpro description:

    Tetrapyrroles are large macrocyclic compounds derived from a common biosynthetic pathway. The end-product, uroporphyrinogen III, is used to synthesise a number of important molecules, including vitamin B12, haem, sirohaem, chlorophyll, coenzyme F430 and phytochromobilin.

    The first stage in tetrapyrrole synthesis is the synthesis of 5-aminolaevulinic acid (ALA) via two possible routes: (1) condensation of succinyl CoA and glycine (C4 pathway) using ALA synthase, or (2) decarboxylation of glutamate (C5 pathway) via three different enzymes, glutamyl-tRNA synthetase to charge a tRNA with glutamate, glutamyl-tRNA reductase to reduce glutamyl-tRNA to glutamate-1-semialdehyde (GSA), and GSA aminotransferase to catalyse a transamination reaction to produce ALA.

    The second stage is to convert ALA to uroporphyrinogen III, the first macrocyclic tetrapyrrolic structure in the pathway. This is achieved by the action of three enzymes in one common pathway: porphobilinogen (PBG) synthase (or ALA dehydratase) to condense two ALA molecules to generate porphobilinogen; hydroxymethylbilane synthase (or PBG deaminase) to polymerise four PBG molecules into preuroporphyrinogen (tetrapyrrole structure); and uroporphyrinogen III synthase to link two pyrrole units together (rings A and D) to yield uroporphyrinogen III.

    Uroporphyrinogen III is the first branch point of the pathway. To synthesise cobalamin (vitamin B12), sirohaem, and coenzyme F430, uroporphyrinogen III needs to be converted into precorrin-2 by the action of uroporphyrinogen III methyltransferase. To synthesise haem and chlorophyll, uroporphyrinogen III needs to be decarboxylated into coproporphyrinogen III by the action of uroporphyrinogen III decarboxylase.

    This entry represents hydroxymethylbilane synthase (or porphobilinogen deaminase), which functions during the second stage of tetrapyrrole biosynthesis. This enzyme catalyses the polymerisation of four PBG molecules into the tetrapyrrole structure, preuroporphyrinogen, with the concomitant release of four molecules of ammonia. This enzyme uses a unique dipyrromethane cofactor made from two molecules of PBG, which is covalently attached to a cysteine side chain. The tetrapyrrole product is synthesised in an ordered, sequential fashion, by initial attachment of the first pyrrole unit (ring A) to the cofactor, followed by subsequent additions of the remaining pyrrole units (rings B, C, D) to the growing pyrrole chain. The link between the pyrrole ring and the cofactor is broken once all the pyrroles have been added. This enzyme is folded into three distinct domains that enclose a single, large active site that makes use of an aspartic acid as its one essential catalytic residue, acting as a general acid/base during catalysis. A deficiency of hydroxymethylbilane synthase is implicated in the neuropathic disease, Acute Intermittent Porphyria (AIP).
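
    The stoichiometry of the second stage follows from the numbers given above: two ALA per PBG, four PBG per preuroporphyrinogen, and four ammonia released during polymerisation. The sketch below simply multiplies these through. It assumes a 1:1 conversion of preuroporphyrinogen to uroporphyrinogen III and ignores the two PBG units held in the enzyme's cofactor; the constant and function names are illustrative only.

```python
# Ratios taken from the description of the second stage.
ALA_PER_PBG = 2                   # PBG synthase condenses two ALA into one PBG
PBG_PER_PREUROPORPHYRINOGEN = 4   # four PBG are polymerised per tetrapyrrole
NH3_PER_PREUROPORPHYRINOGEN = 4   # four ammonia are released per tetrapyrrole


def second_stage_totals(n_uroporphyrinogen_iii):
    """Return precursor and by-product counts for n uroporphyrinogen III.

    Assumes one preuroporphyrinogen yields one uroporphyrinogen III and
    does not count the PBG units bound in the dipyrromethane cofactor.
    """
    preuroporphyrinogen = n_uroporphyrinogen_iii
    pbg = preuroporphyrinogen * PBG_PER_PREUROPORPHYRINOGEN
    return {
        "ALA": pbg * ALA_PER_PBG,
        "PBG": pbg,
        "NH3": preuroporphyrinogen * NH3_PER_PREUROPORPHYRINOGEN,
    }


totals = second_stage_totals(1)
print(totals)
```

    Under these assumptions, one molecule of uroporphyrinogen III accounts for eight ALA, four PBG and four released ammonia.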

    Proteins where this domain is known:
    PFL0480w   


    SSF54791 - SSF54791 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.295    PF10_0115    PF14_0151    PF14_0661    PFB0370c    PFC0130c    PFE0500c    PFF0250w    PFF1135w   


    SSF54814 - KH_prok (Superfamily link)

    Interpro entry IPR009019 : K Homology, prokaryotic type (Interpro link)

    Interpro description:

    The K homology domain is a common RNA-binding motif present in one or multiple copies in both prokaryotic and eukaryotic regulatory proteins. The KH motifs may act cooperatively to bind RNA in the case of multiple motifs, or independently in the case of single KH motif proteins. Prokaryotic (pKH) and eukaryotic (eKH) KH domains share a KH-motif, but have different topologies. The pKH domain has been found in a number of proteins, including the N-terminal domain of the S3 ribosomal protein, the C-terminal domain of Era GTPase and the two C-terminal domains of the NusA transcription factor. The structure of the pKH domain consists of a two-layer alpha/beta fold in the arrangement alpha/beta(2)/alpha/beta.

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    PF14_0339    PF14_0627   


    SSF54821 - Ribosomal_S3_C (Superfamily link)

    Interpro entry IPR001351 : Ribosomal protein S3, C-terminal (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome which they belong to - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S3 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S3 is known to be involved in the binding of initiator Met-tRNA. This family of ribosomal proteins includes S3 from bacteria, algae and plant chloroplast, cyanelle, archaebacteria, plant mitochondria, vertebrates, insects, Caenorhabditis elegans and yeast. This entry is the C-terminal domain.

    Proteins where this domain is known:
    PF14_0627   


    SSF54826 - SSF54826 (Superfamily link)

    Proteins where this domain is known:
    PF10_0155   


    SSF54843 - Ribosomal_L22 (Superfamily link)

    Interpro entry IPR001063 : Ribosomal protein L22/L17 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L22 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L22 is known to bind 23S rRNA. It belongs to a family of ribosomal proteins which includes: bacterial L22; algal and plant chloroplast L22 (in legumes L22 is encoded in the nucleus instead of the chloroplast); cyanelle L22; archaebacterial L22; mammalian L17; plant L17 and yeast YL17.

    Proteins where this domain is known:
    PF10_0105    PF13_0268    PF14_0642   


    SSF54849 - SSF54849 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.283    PF10_0153    PF11_0331    PFB0635w    PFC0285c    PFC0350c    PFC0900w    PFF0430w    PFL1425w    PFL1545c   


    SSF54862 - SSF54862 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.344   


    SSF54913 - N-reg_PII-like_a/b (Superfamily link)

    Interpro entry IPR011322 : (Interpro link)

    Interpro description:

    This entry represents a structural domain found in the nitrogen regulatory protein PII, in ATP phosphoribosyltransferases (C-terminal domain), in the divalent ion tolerance protein CutA1, and in some bacterial hypothetical proteins. This domain consists of a ferredoxin-like alpha/beta sandwich, which forms trimeric structures with orthogonally packed beta-sheets around a three-fold axis.

    PII is a trimeric protein encoded by glnB that functions as a component of the adenylation cascade involved in the regulation of glutamine synthetase (GS) activity. PII helps regulate the level of glutamine synthetase in response to nitrogen source availability. In nitrogen-limiting conditions, PII is uridylylated to form PII-UMP, which allows the deadenylation of glutamine synthetase, thus activating the enzyme. Conversely, in nitrogen excess, PII-UMP is deuridylated to PII, promoting the adenylation and deactivation of glutamine synthetase.

    ATP phosphoribosyltransferase is the first enzyme of the histidine pathway. It is allosterically regulated, controlling the flow of intermediates through the pathway. The C-terminal domain is the regulatory region of the protein, which binds the allosteric inhibitor histidine.

    CutA1 functions in divalent ion tolerance in bacteria, plants and animals. Divalent metal ions play key roles in all living organisms, serving as cofactors for many proteins involved in a variety of electron-transfer activities. In Escherichia coli it is thought to be involved in copper ion tolerance, excessive copper ions being toxic.

    Proteins where this domain is known:
    PFL2375c   


    SSF54919 - NDK (Superfamily link)

    Interpro entry IPR001564 : Nucleoside diphosphate kinase, core (Interpro link)

    Interpro description:

    Nucleoside diphosphate kinases (NDK) are enzymes required for the synthesis of nucleoside triphosphates (NTP) other than ATP. They provide NTPs for nucleic acid synthesis, CTP for lipid synthesis, UTP for polysaccharide synthesis and GTP for protein elongation, signal transduction and microtubule polymerization.

    In eukaryotes, there seems to be a small family of NDK isozymes each of which acts in a different subcellular compartment and/or has a distinct biological function. Eukaryotic NDK isozymes are hexamers of two highly related chains (A and B). By random association (A6, A5B...AB5, B6), these two kinds of chain form isoenzymes differing in their isoelectric point.
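The random association of A and B chains described above can be illustrated with a short enumeration. This is only a sketch: the function name `hexamer_isoforms` and the parameter `p_a` (the fraction of A chains in the pool) are hypothetical, and the binomial weighting assumes that chains associate independently, which the entry does not state.

```python
from math import comb


def hexamer_isoforms(p_a=0.5, subunits=6):
    """Enumerate A/B compositions of a hexamer formed by random association.

    p_a is an assumed fraction of A chains in the pool (illustrative value).
    Returns a list of (label, n_a, n_b, expected_fraction) tuples.
    """
    isoforms = []
    for n_b in range(subunits + 1):
        n_a = subunits - n_b
        # Build labels such as "A6", "A5B", "A3B3", "AB5", "B6".
        label = "".join(
            chain + (str(n) if n > 1 else "")
            for chain, n in (("A", n_a), ("B", n_b))
            if n > 0
        )
        # Binomial weight: number of ways to place the B chains, times
        # the probability of drawing that many A and B chains.
        fraction = comb(subunits, n_b) * p_a**n_a * (1 - p_a) ** n_b
        isoforms.append((label, n_a, n_b, fraction))
    return isoforms


for label, n_a, n_b, fraction in hexamer_isoforms():
    print(f"{label:5s} A={n_a} B={n_b} expected fraction={fraction:.4f}")
```

Under these assumptions two chain types give seven possible compositions, from A6 to B6, which is consistent with the series listed in the description.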

    NDK are proteins of 17 kDa that act via a ping-pong mechanism in which a histidine residue is phosphorylated by transfer of the terminal phosphate group from ATP. In the presence of magnesium, the phosphoenzyme can transfer its phosphate group to any NDP, to produce an NTP.
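The two half-reactions of the ping-pong mechanism can be pictured as a two-state toy model. The class `ToyNDK` and its method names are hypothetical, nucleotides are represented only by their names, and no kinetics or magnesium dependence is modelled.

```python
class ToyNDK:
    """Toy two-state model of the NDK ping-pong mechanism (illustrative only)."""

    def __init__(self):
        # State of the catalytic histidine residue.
        self.his_phosphorylated = False

    def load_from(self, donor_ntp):
        """First half-reaction: the donor NTP phosphorylates the histidine."""
        if self.his_phosphorylated:
            raise ValueError("histidine is already phosphorylated")
        if not donor_ntp.endswith("TP"):
            raise ValueError("donor must be a nucleoside triphosphate")
        self.his_phosphorylated = True
        return donor_ntp[:-2] + "DP"  # e.g. ATP -> ADP

    def transfer_to(self, acceptor_ndp):
        """Second half-reaction: the phosphoenzyme phosphorylates an NDP."""
        if not self.his_phosphorylated:
            raise ValueError("enzyme carries no phosphate to transfer")
        if not acceptor_ndp.endswith("DP"):
            raise ValueError("acceptor must be a nucleoside diphosphate")
        self.his_phosphorylated = False
        return acceptor_ndp[:-2] + "TP"  # e.g. GDP -> GTP


enzyme = ToyNDK()
print(enzyme.load_from("ATP"))    # donor product of the first half-reaction
print(enzyme.transfer_to("GDP"))  # NTP produced in the second half-reaction
```

The model enforces the defining feature of a ping-pong mechanism: the first product leaves before the second substrate is converted, so the two substrates never need to be bound at the same time.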

    NDK isozymes have been sequenced from prokaryotic and eukaryotic sources. It has also been shown that the Drosophila awd (abnormal wing discs) protein is a microtubule-associated NDK. Mammalian NDK is also known as metastasis inhibition factor nm23. The sequence of NDK has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. Our signature pattern contains this residue.

    Proteins where this domain is known:
    PF13_0349    PFF0275c   


    SSF54928 - SSF54928 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.120    MAL13P1.303    MAL13P1.338    MAL13P1.35    MAL7P1.126    MAL7P1.157a    MAL8P1.40    MAL8P1.83    PF07_0066    PF07_0083    PF08_0086    PF10_0028    PF10_0047    PF10_0068    PF10_0194    PF10_0214    PF10_0217    PF10_0235    PF10_0279    PF11_0083    PF11_0111    PF11_0200    PF11_0205    PF11_0279    PF11_0320    PF11_0330    PF11_0347    PF11_0402    PF13_0058    PF13_0098    PF13_0122    PF13_0147    PF13_0158    PF13_0165    PF13_0278    PF13_0315    PF13_0318    PF14_0028    PF14_0056    PF14_0057    PF14_0096    PF14_0194    PF14_0433    PF14_0513    PF14_0656    PFB0255w    PFC0130c    PFC0865w    PFD0700c    PFD0750w    PFD0775c    PFE0160c    PFE0750c    PFE0865c    PFE0885w    PFF0150c    PFF0300w    PFF0320c    PFF0505c    PFF0760w    PFF1125c    PFF1425w    PFI0820c    PFI1025w    PFI1175c    PFI1435w    PFI1600w    PFI1695c    PFL0375w    PFL0830w    PFL1170w    PFL1200c    PFL1705w    PFL1745c    PFL2130w    PFL2310w   


    SSF54980 - EFG_III_V (Superfamily link)

    Interpro entry IPR009022 : (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    EF2 (or EF-G) participates in the elongation phase of protein synthesis by promoting the GTP-dependent translocation of the peptidyl-tRNA of the nascent protein chain from the A-site (acceptor site) to the P-site (peptidyl-tRNA site) of the ribosome. EF2 also has a role after the termination phase of translation, where, together with the ribosomal recycling factor, it facilitates the release of tRNA and mRNA from the ribosome, and the splitting of the ribosome into two subunits. EF2 is folded into five domains, with domains I and II forming the N-terminal block, domains IV and V forming the C-terminal block, and domain III providing the covalently-linked flexible connection between the two. Domains III and V have the same fold (although they are not completely superimposable and domain III lacks some of the superfamily characteristics), consisting of an alpha/beta sandwich with an antiparallel beta-sheet in a (beta/alpha/beta)x2 topology. This double split beta/alpha/beta fold is also seen in a number of ribonucleotide binding proteins. It is the most common motif occurring in the translation system and is referred to as the ribonucleoprotein (RNP) motif or RNA recognition motif (RRM).

    This domain is found in EF2 proteins from both prokaryotes and eukaryotes, as well as in some tetracycline resistance proteins, peptide chain release factors, and in the C-terminal region of the bacterial hypothetical protein, YigZ.

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    MAL13P1.243    PF07_0062    PF10_0041    PF14_0486    PFF0115c    PFI0570w    PFL1590c    PFL1710c   


    SSF54984 - Transl_elong_EF1B_B/D_G_exch (Superfamily link)

    Interpro entry IPR014038 : Translation elongation factor EF1B, beta and delta chains, guanine nucleotide exchange (Interpro link)

    Interpro description:

    Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution.

    Elongation factor EF1B (also known as EF-Ts or EF-1beta/gamma/delta) is a nucleotide exchange factor that is required to regenerate EF1A from its inactive form (EF1A-GDP) to its active form (EF1A-GTP). EF1A is then ready to interact with a new aminoacyl-tRNA to begin the cycle again. EF1B is more complex in eukaryotes than in bacteria, and can consist of three subunits: EF1B-alpha (or EF-1beta), EF1B-gamma (or EF-1gamma) and EF1B-beta (or EF-1delta).

    This entry represents the guanine nucleotide exchange domain of the beta (EF-1beta, also known as EF1B-alpha) and delta (EF-1delta, also known as EF1B-beta) chains of EF1B proteins from eukaryotes and archaea. The beta and delta chains have exchange activity, which mainly resides in their homologous guanine nucleotide exchange domains, found in the C-terminal region of the peptides. Their N-terminal regions may be involved in interactions with the gamma chain (EF-1gamma).

    More information about these proteins can be found at Protein of the Month: Elongation Factors.

    Proteins where this domain is known:
    PFC0870w    PFI0645w   


    SSF54991 - Fdx_AntiC_bd (Superfamily link)

    Interpro entry IPR005121 : Phenylalanyl-tRNA synthetase, beta subunit, ferrodoxin-fold anticodon-binding (Interpro link)

    Interpro description:

    This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold, consisting of an alpha+beta sandwich with anti-parallel beta-sheets (beta-alpha-beta x2).

    Proteins where this domain is known:
    PFF0180w   


    SSF54995 - Ribosomal_S6 (Superfamily link)

    Interpro entry IPR000529 : Ribosomal protein S6 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S6 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S6 is known to bind together with S18 to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial, red algal chloroplast and cyanelle S6 ribosomal proteins.

    Proteins where this domain is known:
    PF14_0606    PFI1585c   


    SSF54999 - Ribosomal_S10 (Superfamily link)

    Interpro entry IPR001848 : Ribosomal protein S10 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Evidence suggests that, in prokaryotes, the peptidyl transferase reaction is performed by the large subunit 23S rRNA, whereas proteins probably have a greater role in eukaryotic ribosomes. Most of the proteins lie close to, or on the surface of, the 30S subunit, arranged peripherally around the rRNA. The small subunit ribosomal proteins can be categorised as primary binding proteins, which bind directly and independently to 16S rRNA; secondary binding proteins, which display no specific affinity for 16S rRNA, but whose assembly is contingent upon the presence of one or more primary binding proteins; and tertiary binding proteins, which require the presence of one or more secondary binding proteins and sometimes other tertiary binding proteins.

    The small ribosomal subunit protein S10 consists of about 100 amino acid residues. In Escherichia coli, S10 is involved in binding tRNA to the ribosome, and also operates as a transcriptional elongation factor. Experimental evidence has revealed that S10 has virtually no groups exposed on the ribosomal surface, and is one of the "split proteins": these are a discrete group that are selectively removed from 30S subunits under low salt conditions and are required for the formation of activated 30S reconstitution intermediate (RI*) particles. S10 belongs to a family of proteins that includes: bacterial S10; algal chloroplast S10; cyanelle S10; archaebacterial S10; Marchantia polymorpha and Prototheca wickerhamii mitochondrial S10; Arabidopsis thaliana mitochondrial S10 (nuclear encoded); vertebrate S20; plant S20; and yeast URP2.

    Proteins where this domain is known:
    PF10_0038    PF14_0581   


    SSF55003 - PAP_C (Superfamily link)

    Interpro entry IPR011068 : Nucleotidyltransferase, class I, C-terminal-like (Interpro link)

    Interpro description:

    Nucleotidyltransferases can be divided into two classes based on highly conserved features of the nucleotidyltransferase motif. Class I enzymes include eukaryotic poly(A) polymerase (PAP), archaeal tRNA CCA-adding enzyme and possibly DNA polymerase beta, while class II enzymes include eukaryotic and eubacterial tRNA CCA-adding enzymes. This entry represents the C-terminal domain of class I nucleotidyltransferases. The C-terminal domain has an alpha/beta sandwich fold, although the archaeal tRNA CCA-adding enzyme has a large insertion; this fold is reminiscent of the RNA-recognition motif fold.

    Poly(A) polymerase, the enzyme at the heart of the polyadenylation machinery, is a template-independent RNA polymerase that specifically incorporates ATP at the 3' end of mRNA. In eukaryotes, polyadenylation of pre-mRNA plays an essential role in the initiation step of protein synthesis, as well as in the export and stability of mRNAs. The catalytic domain of poly(A) polymerase shares substantial structural homology with other nucleotidyl transferases such as DNA polymerase beta and kanamycin transferase. The three invariant aspartates of the catalytic triad ligate two of the three active site metals. One of these metals also contacts the adenine ring. Furthermore, conserved, catalytically important residues contact the nucleotide. These contacts, taken together with metal coordination of the adenine base, provide a structural basis for ATP selection by poly(A) polymerase.

    The archaeal CCA-adding enzyme builds and repairs the 3' end of tRNA. A single active site (nucleotidyltransferase motif) adds both CTP and ATP. This enzyme is the only RNA polymerase that can build or rebuild a specific nucleic acid sequence without using a nucleic acid template.
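The build-or-repair behaviour described above can be sketched as a small function that reports which nucleotides remain to be added to a partial 3' tail. The function name `additions_needed` is hypothetical, and the sketch assumes the caller already knows which residues belong to the tail; the enzyme itself uses no template, and nothing here models how it does so.

```python
CCA = "CCA"
# Nucleotide triphosphate consumed for each residue added to the tail.
DONOR = {"C": "CTP", "A": "ATP"}


def additions_needed(tail):
    """Return the (residue, donor NTP) steps needed to complete a CCA tail.

    `tail` is the part of the 3'-terminal CCA already present:
    "", "C", "CC" or "CCA". Anything else is rejected.
    """
    if not CCA.startswith(tail):
        raise ValueError(f"{tail!r} is not a partial CCA tail")
    return [(residue, DONOR[residue]) for residue in CCA[len(tail):]]


for partial in ("", "C", "CC", "CCA"):
    print(repr(partial), "->", additions_needed(partial))
```

A complete tail yields an empty list, an empty tail yields two CTP additions followed by one ATP addition, and intermediate tails yield only the missing steps, which corresponds to the repair case.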

    Proteins where this domain is known:
    PFF1240w   


    SSF55060 - SSF55060 (Superfamily link)

    Proteins where this domain is known:
    PFE0150c   


    SSF55073 - A/G_cyclase (Superfamily link)

    Interpro entry IPR001054 : Adenylyl cyclase class-3/4/guanylyl cyclase (Interpro link)

    Interpro description:

    Guanylate cyclases catalyse the formation of cyclic GMP (cGMP) from GTP. cGMP acts as an intracellular messenger, activating cGMP-dependent kinases and regulating cGMP-sensitive ion channels. The role of cGMP as a second messenger in vascular smooth muscle relaxation and retinal photo-transduction is well established. Guanylate cyclase is found both in the soluble and particulate fractions of eukaryotic cells. The soluble and plasma membrane-bound forms differ in structure, regulation and other properties. Most currently known plasma membrane-bound forms are receptors for small polypeptides. The soluble forms of guanylate cyclase are cytoplasmic heterodimers having alpha and beta subunits.

    In all characterised eukaryote guanylyl- and adenylyl cyclases, cyclic nucleotide synthesis is carried out by the conserved class III cyclase domain.

    Proteins where this domain is known:
    MAL13P1.301    MAL8P1.150    PF11_0395    PF14_0043   


    SSF55120 - SSF55120 (Superfamily link)

    Proteins where this domain is known:
    PF07_0125    PF08_0123    PF10_0175    PF10_0341    PF14_0174    PFB0530c    PFB0890c    PFE0570w    PFE0815w    PFE1080w    PFI0420c    PFI0685w    PFL1350w    PFL1380w   


    SSF55129 - Ribosomal_L30 (Superfamily link)

    Interpro entry IPR016082 : Ribosomal protein L30, ferredoxin-like fold domain (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L30 is one of the proteins from the large ribosomal subunit. L30 belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups bacterial and archaeal L30, yeast mitochondrial L33, and Drosophila melanogaster, Dictyostelium discoideum (Slime mold), fungal and mammalian L7 ribosomal proteins. L30 from bacteria are small proteins of about 60 residues, those from archaea are proteins of about 150 residues, and eukaryotic L7 are proteins of about 250 to 270 residues.

    This entry represents a domain with a ferredoxin-like fold, with a core structure of beta-alpha-beta-alpha-beta. This domain is found in prokaryotic ribosomal protein L30 (short-chain member of the family), as well as in archaeal L30 (L30a) (long-chain member of the family), the latter containing an additional C-terminal (sub)domain.

    Proteins where this domain is known:
    MAL13P1.272    PFC0300c   


    SSF55154 - mRNA_capping_enz_bsu (Superfamily link)

    Interpro entry IPR004206 : mRNA capping enzyme, beta subunit (Interpro link)

    Interpro description:
    The mRNA capping enzyme in yeasts is composed of two separate subunits, an mRNA guanylyltransferase and an RNA 5'-triphosphatase. This is the beta subunit of mRNA capping enzyme, which has triphosphatase activity. The beta chain (polynucleotide 5'-phosphatase) converts the 5'-triphosphate end of a nascent mRNA chain into a diphosphate in the first step of mRNA capping. The function of the capping enzyme also depends on the guanylyltransferase activity conferred by the alpha chain.

    Proteins where this domain is known:
    PFC0980c   


    SSF55159 - TIF_SUI1 (Superfamily link)

    Interpro entry IPR001950 : Translation initiation factor SUI1 (Interpro link)

    Interpro description:
    In Saccharomyces cerevisiae (Baker's yeast), SUI1 is a translation initiation factor that functions in concert with eIF-2 and the initiator tRNA-Met in directing the ribosome to the proper start site of translation. SUI1 is a protein of 108 residues. Close homologs of SUI1 have been found in mammals, insects and plants. SUI1 is also evolutionarily related to hypothetical proteins from Escherichia coli (yciH), Haemophilus influenzae (HI1225) and Methanococcus vannielii.

    Proteins where this domain is known:
    PF08_0079    PFI0365w    PFL2095w   


    SSF55174 - SSF55174 (Superfamily link)

    Proteins where this domain is known:
    PF11_0181    PF14_0584    PFB0890c    PFE1005w    PFL1380w   


    SSF55186 - Thr/Ala-tRNA-synth_IIc_edit (Superfamily link)

    Interpro entry IPR018163 : Threonyl/alanyl tRNA synthetase, class II-like, putative editing domain (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.
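The class assignments listed above can be captured as a small lookup table. This is a sketch for illustration only: the names `CLASS_I`, `CLASS_II`, `PREFERRED_HYDROXYL` and `synthetase_class` are hypothetical, and the table records only what the description states (it does not encode the a, b and c subclasses).

```python
# Amino acid specificities of the two synthetase classes, as listed above.
CLASS_I = (
    "arginine", "cysteine", "glutamic acid", "glutamine", "isoleucine",
    "leucine", "methionine", "tyrosine", "tryptophan", "valine",
)
CLASS_II = (
    "alanine", "asparagine", "aspartic acid", "glycine", "histidine",
    "lysine", "phenylalanine", "proline", "serine", "threonine",
)
# Hydroxyl of the tRNA to which the aminoacyl group is preferentially coupled.
PREFERRED_HYDROXYL = {"I": "2'", "II": "3'"}


def synthetase_class(amino_acid):
    """Return "I" or "II" for the synthetase specific to the given amino acid."""
    name = amino_acid.strip().lower()
    if name in CLASS_I:
        return "I"
    if name in CLASS_II:
        return "II"
    raise KeyError(f"no synthetase class recorded for {amino_acid!r}")


for aa in ("Threonine", "alanine", "arginine"):
    cls = synthetase_class(aa)
    print(f"{aa}: class {cls}, preferred site {PREFERRED_HYDROXYL[cls]}-hydroxyl")
```

The two tuples hold ten amino acids each, matching the 20 synthetases mentioned in the description.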

    This entry represents a structural domain containing a two-layer core alpha/beta structure: alpha-beta(2)-alpha-beta(2). This domain is a putative editing domain found in the N-terminal part of threonyl-tRNA synthetase (ThrRS), the C-terminal part of alanyl-tRNA synthetase (AlaRS), and as a stand-alone hypothetical protein from the archaeon Pyrococcus horikoshii; it is a probable circular permutation of LuxS.

    Proteins where this domain is known:
    PF11_0270    PF13_0205    PF13_0354   


    SSF55190 - Arg-tRNA-synth_Ic_N (Superfamily link)

    Interpro entry IPR005148 : Arginyl tRNA synthetase, class Ic, N-terminal (Interpro link)

    Interpro description:

    The aminoacyl-tRNA synthetases catalyse the attachment of an amino acid to its cognate transfer RNA molecule in a highly specific two-step reaction. These proteins differ widely in size and oligomeric state, and have limited sequence homology. The 20 aminoacyl-tRNA synthetases are divided into two classes, I and II. Class I aminoacyl-tRNA synthetases contain a characteristic Rossmann fold catalytic domain and are mostly monomeric. Class II aminoacyl-tRNA synthetases share an anti-parallel beta-sheet fold flanked by alpha-helices, and are mostly dimeric or multimeric, containing at least three conserved regions. However, tRNA binding involves an alpha-helical structure that is conserved between class I and class II synthetases. In reactions catalysed by the class I aminoacyl-tRNA synthetases, the aminoacyl group is coupled to the 2'-hydroxyl of the tRNA, while, in class II reactions, the 3'-hydroxyl site is preferred. The synthetases specific for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan and valine belong to class I synthetases; these synthetases are further divided into three subclasses, a, b and c, according to sequence homology. The synthetases specific for alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and threonine belong to class II synthetases.

    This domain, also called additional domain 1 (Add-1), is found at the N-terminus of arginyl-tRNA synthetase. It is about 140 residues long, and it has been suggested that it is involved in tRNA recognition.

    Proteins where this domain is known:
    PFL0900c   


    SSF55194 - Ribosome_recyc_fac (Superfamily link)

    Interpro entry IPR002661 : Ribosome recycling factor (Interpro link)

    Interpro description:

    The ribosome recycling factor or ribosome release factor (RRF) dissociates ribosomes from mRNA after termination of translation, and is essential for bacterial growth. Thus ribosomes are 'recycled' and ready for another round of protein synthesis.

    Proteins where this domain is known:
    PFB0390w    PFD0990w   


    SSF55200 - IF3 (Superfamily link)

    Interpro entry IPR001288 : Initiation factor 3 (Interpro link)

    Interpro description:

    Initiation factor 3 (IF-3) (gene infC) is one of the three factors required for the initiation of protein biosynthesis in bacteria. IF-3 is thought to function as a fidelity factor during the assembly of the ternary initiation complex, which consists of the 30S ribosomal subunit, the initiator tRNA and the messenger RNA. IF-3 is a basic protein that binds to the 30S ribosomal subunit. The chloroplast initiation factor IF-3(chl) is a protein that enhances the poly(A,U,G)-dependent binding of the initiator tRNA to chloroplast ribosomal 30S subunits; its central section is evolutionarily related to the sequence of bacterial IF-3.

    Proteins where this domain is known:
    MAL8P1.27   


    SSF55205 - RNA3'_cycl/enolpyr_transf_A/B (Superfamily link)

    Interpro entry IPR013792 : RNA 3'-terminal phosphate cyclase/enolpyruvate transferase, alpha/beta (Interpro link)

    Interpro description:

    This entry represents an alpha/beta domain consisting of alternating beta-strands and alpha-helices in two layers. This domain is found in RNA 3'-terminal phosphate cyclase (RPTC), where it occurs as a duplication of three repeats of this fold packed together around a pseudo three-fold axis. RNA cyclases are a family of RNA-modifying enzymes that catalyse the ATP-dependent conversion of the 3'-phosphate to the 2',3'-cyclic phosphodiester at the end of RNA. These cyclases contain an insert alpha/beta domain with a thioredoxin topology.

    This domain is also found in enolpyruvate transferase, where it occurs as a duplication of six repeats of this fold organised into two RPTC-like domains. Enolpyruvate transferase is the first enzyme in bacterial peptidoglycan biosynthesis, catalysing the transfer of enolpyruvate from phosphoenolpyruvate to UDP-N-acetyl-glucosamine.

    Proteins where this domain is known:
    PF14_0677    PFB0280w   


    SSF55248 - Trans_pterinDh (Superfamily link)

    Interpro entry IPR001533 : Transcriptional coactivator/pterin dehydratase (Interpro link)

    Interpro description:

    DCoH is the dimerisation cofactor of hepatocyte nuclear factor 1 (HNF-1) that functions as both a transcriptional coactivator and a pterin dehydratase. X-ray crystallographic studies have shown that the ligand binds at four sites per tetrameric enzyme, with little apparent conformational change in the protein.

    Proteins where this domain is known:
    PF11_0095a   


    SSF55257 - RNAP_RBP11-like (Superfamily link)

    Interpro entry IPR009025 : DNA-directed RNA polymerase, RBP11-like (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

    RNA synthesis follows the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise:

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses range from 500 to 700 kDa, contain two non-identical large (>100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and the prokaryotic RNAP alpha subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact. These different subunits share regions of homology. Rpb11 contains a domain (Rpb11-like domain) that is required for dimerisation, and binds to a homologous region on Rpb3. The Rpb11-like domain in Rpb11 and archaeal L subunits is contiguous, whereas in Rpb3, archaeal D, and prokaryotic alpha subunits, the Rpb11-like domain is interrupted by an insert domain. In the prokaryotic RNAP alpha subunit, the Rpb11-like domain and the insert domain form two subregions of the N-terminal domain.

    The structure of the Rpb11-like domain consists of a two-layer alpha/beta fold consisting of beta(2)-alpha-beta(2)-alpha. Rpb3 and Rpb11 in yeast RNAP have been shown to share a high degree of sequence and structural similarity to the alpha subunit of bacterial RNAP.

    Proteins where this domain is known:
    PF11_0445    PF13_0023    PF13_0040    PF14_0150    PF14_0695    PFI1130c   


    SSF55271 - DNA_mismatch_repair_MutS_N (Superfamily link)

    Interpro entry IPR016151 : DNA mismatch repair protein MutS, N-terminal (Interpro link)

    Interpro description:

    Mismatch repair contributes to the overall fidelity of DNA replication and is essential for combating the adverse effects of damage to the genome. It involves the correction of mismatched base pairs that have been missed by the proofreading element of the DNA polymerase complex. The post-replicative Mismatch Repair System (MMRS) of Escherichia coli involves MutS (Mutator S), MutL and MutH proteins, and acts to correct point mutations or small insertion/deletion loops produced during DNA replication. MutS and MutL are involved in preventing recombination between partially homologous DNA sequences. The assembly of MMRS is initiated by MutS, which recognises and binds to mispaired nucleotides and allows further action of MutL and MutH to eliminate a portion of newly synthesized DNA strand containing the mispaired base. MutS can also collaborate with methyltransferases in the repair of O(6)-methylguanine damage, which would otherwise pair with thymine during replication to create an O(6)mG:T mismatch. MutS exists as a dimer, where the two monomers have different conformations and form a heterodimer at the structural level. Only one monomer recognises the mismatch specifically and has ADP bound. Non-specific major groove DNA-binding domains from both monomers embrace the DNA in a clamp-like structure. Mismatch binding induces ATP uptake and a conformational change in the MutS protein, resulting in a clamp that translocates on DNA.

    MutS is a modular protein with a complex structure, and is composed of:

    Homologues of MutS have been found in many species including eukaryotes (MSH 1, 2, 3, 4, 5, and 6 proteins), archaea and bacteria, and together these proteins have been grouped into the MutS family. Although many of these proteins have similar activities to the E. coli MutS, there is significant diversity of function among the MutS family members. This diversity is even seen within species, where many species encode multiple MutS homologues with distinct functions. Inter-species homologues may have arisen through frequent ancient horizontal gene transfer of MutS (and MutL) from bacteria to archaea and eukaryotes via endosymbiotic ancestors of mitochondria and chloroplasts.

    This entry represents the N-terminal domain of proteins in the MutS family of DNA mismatch repair proteins. The N-terminal domain of MutS is responsible for mismatch recognition and forms a 6-stranded mixed beta-sheet surrounded by three alpha-helices, which is similar to the structure of tRNA endonuclease.

    Proteins where this domain is known:
    PF14_0051    PFE0270c   


    SSF55277 - GYF (Superfamily link)

    Interpro entry IPR003169 : (Interpro link)

    Interpro description:

    The glycine-tyrosine-phenylalanine (GYF) domain is a domain of around 60 amino acids which contains a conserved GP[YF]xxxx[MV]xxWxxx[GN]YF motif. It was identified in the human intracellular protein termed CD2 binding protein 2 (CD2BP2), which binds to a site containing two tandem PPPGHR segments within the cytoplasmic region of CD2. Binding experiments and mutational analyses have demonstrated the critical importance of the GYF tripeptide in ligand binding. A GYF domain is also found in several other eukaryotic proteins of unknown function. It has been proposed that the GYF domain found in these proteins could also be involved in proline-rich sequence recognition. Resolution of the structure of the CD2BP2 GYF domain by NMR spectroscopy revealed a compact domain with a beta-beta-alpha-beta-beta topology, where the single alpha-helix is tilted away from the twisted, anti-parallel beta-sheet. The conserved residues of the GYF domain create a contiguous patch of predominantly hydrophobic nature which forms an integral part of the ligand-binding site. There is limited homology within the C-terminal 20-30 amino acids of various GYF domains, supporting the idea that this part of the domain is structurally but not functionally important.
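
    The conserved motif quoted above, GP[YF]xxxx[MV]xxWxxx[GN]YF, can be read as a regular expression in which each x stands for any single residue. The sketch below is illustrative only: the function name find_gyf_motifs and the test sequence are made up, and a strict pattern match is an assumption that would miss domains deviating from the consensus.

```python
import re

# Direct transcription of GP[YF]xxxx[MV]xxWxxx[GN]YF; "x" becomes "." (any residue).
GYF_MOTIF = re.compile(r"GP[YF].{4}[MV].{2}W.{3}[GN]YF")


def find_gyf_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each motif hit in a protein sequence."""
    return [(m.start(), m.group()) for m in GYF_MOTIF.finditer(sequence.upper())]


if __name__ == "__main__":
    # Synthetic sequence constructed to contain the motif; not a real protein.
    synthetic = "MK" + "GPYAAAAMAAWAAAGYF" + "LL"
    print(find_gyf_motifs(synthetic))
```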

    Proteins where this domain is known:
    PF10_0183    PF10_0310    PF13_0067    PFF0220w   


    SSF55282 - Ribosomal_L5 (Superfamily link)

    Interpro entry IPR002132 : Ribosomal protein L5 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein L5 is one of the proteins from the large ribosomal subunit. In Escherichia coli, L5 is known to be involved in binding 5S RNA to the large ribosomal subunit. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups:

    L5 is a protein of about 180 amino-acid residues.

    Proteins where this domain is known:
    PF07_0079   


    SSF55287 - RNApol_RPB5_like (Superfamily link)

    Interpro entry IPR000783 : RNA polymerase, subunit H/Rpb5 C-terminal (Interpro link)

    Interpro description:

    Prokaryotes contain a single DNA-dependent RNA polymerase (RNAP) that is responsible for the transcription of all genes, while eukaryotes have three classes of RNAPs (I-III) that transcribe different sets of genes. Each class of RNA polymerase is an assemblage of ten to twelve different polypeptides. Certain subunits of RNAPs, including RPB5 (POLR2E in mammals), are common to all three eukaryotic polymerases. RPB5 plays a role in the transcription activation process. Eukaryotic RPB5 has a bipartite structure consisting of a unique N-terminal region, plus a C-terminal region that is structurally homologous to the prokaryotic RPB5 homologue, subunit H (gene rpoH).

    This entry represents prokaryotic subunit H and the C-terminal domain of eukaryotic RPB5, which share a two-layer alpha/beta fold, with a core structure of beta/alpha/beta/alpha/beta(2).

    Proteins where this domain is known:
    PF13_0341   


    SSF55307 - Tub_FtsZ_C (Superfamily link)

    Interpro entry IPR008280 : Tubulin/FtsZ, C-terminal (Interpro link)

    Interpro description:

    This domain is found in the tubulin alpha, beta and gamma chains, as well as the bacterial FtsZ family of proteins. These proteins are GTPases and are involved in polymer formation. Tubulin is the major component of microtubules, while FtsZ is the polymer-forming protein of bacterial cell division; it is part of a ring in the middle of the dividing cell that is required for constriction of the cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerise into tubes, sheets, and rings in vitro and is ubiquitous in bacteria and archaea. This is the C-terminal domain.

    Proteins where this domain is known:
    PF08_0125    PF10_0084    PF14_0725    PFD1050w    PFI0180w    PFI1635w   


    SSF55315 - SSF55315 (Superfamily link)

    Proteins where this domain is known:
    MAL7P1.118    PF10_0187    PF11_0250    PF14_0231    PFB0550w    PFB0855c    PFC0295c    PFC0405c    PFD0960c   


    SSF55326 - PurM_N-like (Superfamily link)

    Interpro entry IPR016188 : PurM, N-terminal-like (Interpro link)

    Interpro description:

    This entry represents a structural domain with a core structure consisting of beta-alpha-beta-alpha-beta(2), which is found in two enzymes of the purine biosynthetic pathway: at the N-terminal of aminoimidazole ribonucleotide (AIR) synthetase (PurM), as well as the N1 and N2 domains of formylglycinamide ribonucleotide (FGAR) amidotransferase (PurL) (PurM-like module). PurM and PurL utilise ATP to activate the oxygen of an amide within their substrate toward nucleophilic attack by a nitrogen. PurM uses the product of PurL, formylglycinamidine ribonucleotide (FGAM) and ATP to make AIR, ADP and P(i). It is also found as domains 1 and 3 in phosphoribosylformylglycinamidine synthase II (smPurL) (carries a duplication: tandem repeats of two PurM-like units arranged like the PurM subunits in the dimer).

    This domain is also found at the N-terminal of thiamine monophosphate kinase (ThiL). ThiL phosphorylates thiamine monophosphate to form thiamine pyrophosphate, an essential cofactor that is synthesised de novo by Salmonella typhimurium.

    Proteins where this domain is known:
    PFI0505c   


    SSF55331 - Tautomerase/MIF (Superfamily link)

    Proteins where this domain is known:
    PFL1420w   


    SSF55347 - SSF55347 (Superfamily link)

    Proteins where this domain is known:
    PF14_0511    PF14_0598    PF14_0641    PFE0585c   


    SSF55418 - TIF_eIF_4E (Superfamily link)

    Interpro entry IPR001040 : Eukaryotic translation initiation factor 4E (eIF-4E) (Interpro link)

    Interpro description:
    Eukaryotic translation initiation factor 4E (eIF-4E) is a protein that binds to the cap structure of eukaryotic cellular mRNAs. eIF-4E recognises and binds the 7-methylguanosine-containing (m7Gppp) cap during an early step in the initiation of protein synthesis and facilitates ribosome binding to a mRNA by inducing the unwinding of its secondary structures. A tryptophan in the central part of the sequence of human eIF-4E seems to be implicated in cap-binding.

    Proteins where this domain is known:
    PFA0570w    PFC0635c   


    SSF55424 - FAD/NAD-linked_reductase_dimer (Superfamily link)

    Interpro entry IPR016156 : FAD/NAD-linked reductase, dimerisation (Interpro link)

    Interpro description:

    This entry represents a dimerisation domain that is usually found at the C-terminal of FAD and NAD-linked reductases. This domain has a core alpha+beta sandwich structure consisting of beta(3,4)-alpha(3). The first two domains are of the same beta/beta/alpha fold. This domain can be found in the following proteins:

    Proteins where this domain is known:
    PF07_0085    PF08_0066    PF14_0192    PFI1170c    PFL1550w   


    SSF55481 - SSF55481 (Superfamily link)

    Proteins where this domain is known:
    PFB0550w   


    SSF55486 - SSF55486 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.184    MAL13P1.56    PF10_0058    PF14_0692    PFD0980w   


    SSF55620 - SSF55620 (Superfamily link)

    Proteins where this domain is known:
    PFF1360w    PFL1155w   


    SSF55637 - Cyc_dep_kin_rsub (Superfamily link)

    Interpro entry IPR000789 : Cyclin-dependent kinase, regulatory subunit (Interpro link)

    Interpro description:

    Protein kinases are a group of enzymes that possess a catalytic subunit which transfers the gamma phosphate from nucleotide triphosphates (often ATP) to one or more amino acid residues in a protein substrate side chain, resulting in a conformational change affecting protein function. The enzymes fall into two broad classes, characterised with respect to substrate specificity: serine/threonine specific and tyrosine specific.

    Protein kinase function has been evolutionarily conserved from Escherichia coli to human. Protein kinases play a role in a multitude of cellular processes, including division, proliferation, apoptosis, and differentiation. Phosphorylation usually results in a functional change of the target protein by changing enzyme activity, cellular location, or association with other proteins.

    The catalytic subunits of protein kinases are highly conserved, and several structures have been solved, leading to large screens to develop kinase-specific inhibitors for the treatments of a number of diseases.

    In eukaryotes, cyclin-dependent protein kinases interact with cyclins to regulate cell cycle progression, and are required for the G1 and G2 stages of cell division. The proteins bind to a regulatory subunit, cyclin-dependent kinase regulatory subunit (CKS), which is essential for their function. This regulatory subunit is a small protein of 79 to 150 residues. In yeast (gene CKS1) and in fission yeast (gene suc1) a single isoform is known, while mammals have two highly related isoforms. The regulatory subunits exist as hexamers, formed by the symmetrical assembly of 3 interlocked homodimers, creating an unusual 12-stranded beta-barrel structure. Through the barrel centre runs a tunnel 12 Å in diameter, lined by 6 exposed helix pairs. Six kinase units can be modelled to bind the hexameric structure, which may thus act as a hub for cyclin-dependent protein kinase multimerisation.

    Proteins where this domain is known:
    PFA0285c    PFI1155w   


    SSF55658 - L9_N_like (Superfamily link)

    Interpro entry IPR009027 : (Interpro link)

    Interpro description:

    The N-terminal domain of the ribosomal protein L9 is a regulatory RNA-binding module that binds to 23S rRNA. L9 is composed of two domains and functions as a structural protein in the large subunit of the ribosome.

    The N-terminal domain of eukaryotic RNase HI, which is lacking in retroviral and prokaryotic enzymes, shows a striking structural similarity to the L9 N-terminal domain, and may also function as a regulatory RNA-binding module. Eukaryotic RNases HI possess either one or two copies of the small N-terminal domain, in addition to the well-conserved catalytic RNase H domain. RNase HI belongs to the family of ribonuclease H enzymes that recognise RNA:DNA hybrids and degrade the RNA component.

    The structures of both the L9 and the RNase HI N-terminal domains consist of a three-stranded antiparallel beta-sheet sandwiched between two short alpha-helices. The hydrophobic core of the domain is formed by the conserved residues that are involved in the packing of the alpha-helices onto the beta-sheet. The (beta)2/alpha/beta/alpha topology of the domain differs from the structures of known RNA binding domains such as the double-stranded RNA binding domain (dsRBD), the hnRNP K homology (KH) domain and the RNP motif.

    Proteins where this domain is known:
    MAL13P1.318   


    SSF55666 - 3_ExoRNase (Superfamily link)

    Interpro entry IPR015847 : Exoribonuclease, phosphorolytic domain 2 (Interpro link)

    Interpro description:

    The PH (phosphorolytic) domain is responsible for 3'-5' exoribonuclease activity, although in some proteins this domain has lost its catalytic function. An active PH domain uses inorganic phosphate as a nucleophile, adding it across the phosphodiester bond between the end two nucleotides in order to release ribonucleoside 5'-diphosphate (rNDP) from the 3' end of the RNA substrate.

    PH domains can be found in bacterial/organelle RNases and PNPases (polynucleotide phosphorylases), as well as in archaeal and eukaryotic RNA exosomes, the latter acting as nano-compartments for the degradation or processing of RNA (including mRNA, rRNA, snRNA and snoRNA). Bacterial/organelle PNPases share a common barrel structure with RNA exosomes, consisting of a hexameric ring of PH domains that act as a degradation chamber, and an S1-domain/KH-domain containing cap that binds the RNA substrate (and sometimes accessory proteins) in order to regulate and restrict entry into the degradation chamber. Unstructured RNA substrates feed in through the pore made by the S1 domains, are degraded by the PH domain ring, and exit as nucleotides via the PH pore at the opposite end of the barrel.

    This entry represents the phosphorolytic (PH) domain 2, which has a core 3-layer alpha/beta/alpha structure. This domain is found in bacterial/organelle PNPases and in archaeal/eukaryotic exosomes.

    More information about these proteins can be found at Protein of the Month: RNA Exosomes.

    Proteins where this domain is known:
    MAL13P1.204    PF13_0340    PF14_0256   


    SSF55681 - SSF55681 (Superfamily link)

    Proteins where this domain is known:
    PF07_0073    PF10_0409    PF11_0051    PF11_0270    PF13_0262    PF13_0354    PF14_0166    PF14_0198    PF14_0428    PF14_0573    PFA0145c    PFA0480w    PFB0525w    PFE0475w    PFE0715w    PFF0180w    PFI1240c    PFI1645c    PFL0670c    PFL0770w    PFL1540c   


    SSF55711 - AP2_adap_app (Superfamily link)

    Interpro entry IPR009028 : Clathrin/coatomer adaptor, adaptin-like, appendage, C-terminal subdomain (Interpro link)

    Interpro description:

    Proteins synthesized on the ribosome and processed in the endoplasmic reticulum are transported from the Golgi apparatus to the trans-Golgi network (TGN), and from there via small carrier vesicles to their final destination compartment. This traffic is bidirectional, to ensure that proteins required to form vesicles are recycled. Vesicles have specific coat proteins (such as clathrin or coatomer) that are important for cargo selection and direction of transfer.

    Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. The two major types of clathrin adaptor complexes are the heterotetrameric adaptor protein (AP) complexes, and the monomeric GGA (Golgi-localising, Gamma-adaptin ear domain homology, ARF-binding proteins) adaptors. All AP complexes are heterotetramers composed of two large subunits (adaptins), a medium subunit (mu) and a small subunit (sigma). Each subunit has a specific function. Adaptin subunits recognise and bind to clathrin through their hinge region (clathrin box), and recruit accessory proteins that modulate AP function through their C-terminal appendage domains. By contrast, GGAs are monomers composed of four domains, which have functions similar to AP subunits: an N-terminal VHS (Vps27p/Hrs/Stam) domain, a GAT (GGA and Tom1) domain, a hinge region, and a C-terminal GAE (gamma-adaptin ear) domain. The GAE domain is similar to the AP gamma-adaptin ear domain, being responsible for the recruitment of accessory proteins that regulate clathrin-mediated endocytosis.

    While clathrin mediates endocytic protein transport from ER to Golgi, coatomers (COPI, COPII) primarily mediate intra-Golgi transport, as well as the reverse Golgi to ER transport of dilysine-tagged proteins. Coatomers reversibly associate with Golgi (non-clathrin-coated) vesicles to mediate protein transport and for budding from Golgi membranes. Coatomer complexes are hetero-oligomers composed of at least an alpha, beta, beta', gamma, delta, epsilon and zeta subunits.

    The alpha and beta2 adaptor subunits can each be divided into a trunk domain and the appendage domain (or ear domain), separated by a linker region. Clathrin polymerisation is promoted by its binding to the beta2 appendage and hinge domains. The alpha appendage domain interacts with a number of accessory proteins, including eps15, epsin, amphiphysin, AP180, auxilin, numb, and Dab2, thereby regulating the translocation of these proteins to the bud site.

    This entry represents a subdomain of the appendage (ear) domain of alpha- and beta-adaptin from AP clathrin adaptor complexes, and the appendage domain of the gamma subunit of coatomer complexes. These domains have a three-layer arrangement, alpha-beta-alpha, with a bifurcated antiparallel beta-sheet. Although the appendage domains from AP adaptins and coatomers share a similar fold, there is little sequence identity between them. However, they also share similar motif-based cargo recognition and accessory factor recruitment mechanisms.

    More information about these proteins can be found at Protein of the Month: Clathrin.

    Proteins where this domain is known:
    PF11_0463    PFE1400c   


    SSF55724 - Mog1/PsbP/DUF1795_a/b/a-sand (Superfamily link)

    Interpro entry IPR016124 : (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of a 3-layer alpha/beta/alpha fold. The beta layer is composed of seven beta-sheets, and the overall order is: (beta-hairpin)-beta(3)-alpha-beta(4)-alpha. Domains with this structure are found in the following protein families:

    Proteins where this domain is known:
    MAL13P1.232   


    SSF55729 - Acyl_CoA_acyltransferase (Superfamily link)

    Interpro entry IPR016181 : (Interpro link)

    Interpro description:

    This entry represents a structural domain found in several acyl-CoA acyltransferase enzymes. This domain has a 3-layer alpha/beta/alpha structure that contains mixed beta-sheets, and can be found in the following proteins:

    Several proteins carry a duplication of this domain, which consists of two NAT-like domains swapped with the C-terminal strands, including:

    Proteins where this domain is known:
    MAL8P1.200    PF08_0034    PF08_0102    PF10_0036    PF11_0192    PF13_0131    PF14_0127    PF14_0350    PFA0465c    PFD0795w    PFF1405c    PFL1075w   


    SSF55753 - SSF55753 (Superfamily link)

    Proteins where this domain is known:
    PF13_0326    PFE0165w   


    SSF55770 - Profilin (Superfamily link)

    Interpro entry IPR002097 : Profilin/allergen (Interpro link)

    Interpro description:
    Profilin is a small eukaryotic protein that binds to monomeric actin (G-actin) in a 1:1 ratio, thus preventing the polymerization of actin into filaments (F-actin). It can also, in certain circumstances, promote actin polymerization. Profilin also binds to polyphosphoinositides such as PIP2. Overall sequence similarity among profilins from organisms which belong to different phyla (ranging from fungi to mammals) is low, but the N-terminal region is relatively well conserved. That region is thought to be involved in the binding to actin.

    A protein structurally similar to profilin is present in the genome of Variola virus and Vaccinia virus (gene A42R).

    Some of the proteins in this family are allergens. Allergies are hypersensitivity reactions of the immune system to specific substances called allergens (such as pollen, stings, drugs, or food) that, in most people, result in no symptoms. A nomenclature system has been established for antigens (allergens) that cause IgE-mediated atopic allergies in humans [WHO/IUIS Allergen Nomenclature Subcommittee King T.P., Hoffmann D., Loewenstein H., Marsh D.G., Platts-Mills T.A.E., Thomas W. Bull. World Health Organ. 72:797-806(1994)]. This nomenclature system is defined by a designation that is composed of the first three letters of the genus; a space; the first letter of the species name; a space and an arabic number. In the event that two species names have identical designations, they are discriminated from one another by adding one or more letters (as necessary) to each species designation.

    The allergens in this family include allergens with the following designations: Ara t 8, Bet v 2, Cyn d 12, Hel a 2, Mer a 1 and Phl p 11.
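
    The naming rule described above (first three letters of the genus, a space, the first letter of the species name, a space, an arabic number) can be sketched as a one-line formatter. The function name allergen_designation is hypothetical, and the genus and species names used in the example (Arabidopsis thaliana, Betula verrucosa) are assumptions about which organisms the listed designations refer to; the disambiguation rule for clashing designations is not implemented.

```python
def allergen_designation(genus: str, species: str, number: int) -> str:
    """Build a designation such as "Bet v 2" from genus, species and allergen number."""
    if len(genus) < 3 or not species:
        raise ValueError("genus needs at least three letters and species one")
    # First three letters of the genus, first letter of the species, then the number.
    return f"{genus[:3].capitalize()} {species[0].lower()} {number}"


if __name__ == "__main__":
    print(allergen_designation("Arabidopsis", "thaliana", 8))
    print(allergen_designation("Betula", "verrucosa", 2))
```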

    Proteins where this domain is known:
    PFI1565w   


    SSF55781 - SSF55781 (Superfamily link)

    Proteins where this domain is known:
    PFB0510w   


    SSF55811 - NUDIX_hydrolase (Superfamily link)

    Interpro entry IPR015797 : NUDIX (Interpro link)

    Interpro description:
    MutT is a small bacterial protein (~12-15 kDa) involved in the GO system responsible for removing an oxidatively damaged form of guanine (8-hydroxyguanine or 7,8-dihydro-8-oxoguanine) from DNA and the nucleotide pool. 8-oxo-dGTP is inserted opposite dA and dC residues of template DNA with near equal efficiency, leading to A-T to G-C transversions. MutT specifically degrades 8-oxo-dGTP to the monophosphate, with the concomitant release of pyrophosphate. A short conserved N-terminal region of MutT (designated the MutT domain) is also found in a variety of other prokaryotic, viral and eukaryotic proteins.

    The generic name 'NUDIX hydrolases' (NUcleoside DIphosphate linked to some other moiety X) has been coined for this domain family. The family can be divided into a number of subgroups, of which MutT anti-mutagenic activity represents only one type; most of the rest hydrolyse diverse nucleoside diphosphate derivatives (including ADP-ribose, GDP-mannose, TDP-glucose, NADH, UDP-sugars, dNTP and NTP).

    Proteins where this domain is known:
    MAL13P1.248    PF13_0048    PFE1035c   


    SSF55821 - DHBP_synth_RibB-like_a/b_dom (Superfamily link)

    Interpro entry IPR017945 : (Interpro link)

    Interpro description:

    This entry represents a structural domain consisting of segregated alpha and beta regions in 3-layers. Homologous domains with this structure are found in:

    DHBP synthase RibB catalyses the conversion of D-ribulose 5-phosphate to formate and 3,4-dihydroxy-2-butanone 4-phosphate, the latter serving as the biosynthetic precursor for the xylene ring of riboflavin. In Photobacterium leiognathi, the riboflavin synthesis genes ribB (DHBP synthase), ribE (riboflavin synthase), ribH (lumazine synthase) and ribA (GTP cyclohydrolase II) all reside in the lux operon. RibB is sometimes found as a bifunctional enzyme with GTP cyclohydrolase II that catalyses the first committed step in the biosynthesis of riboflavin. No sequences with significant homology to DHBP synthase are found in the metazoa.

    The YrdC family of hypothetical proteins are widely distributed in eukaryotes and prokaryotes and occur as: (i) independent proteins, (ii) with C-terminal extensions, and (iii) as domains in larger proteins, some of which are implicated in regulation. YrdC from Escherichia coli preferentially binds to double-stranded RNA and DNA. YrdC is predicted to be an rRNA maturation factor, as deletions in its gene lead to immature ribosomal 30S subunits and, consequently, fewer translating ribosomes. Therefore, YrdC may function by keeping an rRNA structure needed for proper processing of 16S rRNA, especially at lower temperatures. Sua5 is an example of a multi-domain protein that contains an N-terminal YrdC-like domain and a C-terminal Sua5 domain. Sua5 was identified in Saccharomyces cerevisiae (Baker's yeast) as a suppressor of a translation initiation defect in the cytochrome c gene and is required for normal growth in yeast; however its exact function remains unknown. HypF is involved in the synthesis of the active site of [NiFe]-hydrogenases.

    Proteins where this domain is known:
    PFL0175c   


    SSF55826 - YbaK/aa-tRNA-synth-assoc-reg (Superfamily link)

    Interpro entry IPR007214 : (Interpro link)

    Interpro description:
    This domain of unknown function is found in numerous prokaryotic organisms. The structure of YbaK shows a novel fold. This domain also occurs in a number of prolyl-tRNA synthetases (proRS) from prokaryotes. Thus, the domain is thought to be involved in oligonucleotide binding, with possible roles in recognition/discrimination or editing of prolyl-tRNA.

    Proteins where this domain is known:
    PFL0670c   


    SSF55831 - Thymidylat_synth_C (Superfamily link)

    Interpro entry IPR000398 : Thymidylate synthase, C-terminal (Interpro link)

    Interpro description:
    Thymidylate synthase catalyzes the reductive methylation of dUMP to dTMP with concomitant conversion of 5,10-methylenetetrahydrofolate to dihydrofolate:
     5,10-methylenetetrahydrofolate + dUMP = dihydrofolate + dTMP 
    This provides the sole de novo pathway for production of dTMP and is the only enzyme in folate metabolism in which the 5,10-methylenetetrahydrofolate is oxidised during one-carbon transfer. The enzyme is essential for regulating the balanced supply of the 4 DNA precursors in normal DNA replication: defects in the enzyme activity affecting the regulation process cause various biological and genetic abnormalities, such as thymineless death. The enzyme is an important target for certain chemotherapeutic drugs. Thymidylate synthase is an enzyme of about 30 to 35 kDa in most species, except in protozoa and plants, where it exists as a bifunctional enzyme that includes a dihydrofolate reductase domain. A cysteine residue is involved in the catalytic mechanism (it covalently binds the 5,6-dihydro-dUMP intermediate). The sequence around the active site of this enzyme is conserved from phages to vertebrates.

    Proteins where this domain is known:
    PFD0830w   


    SSF55856 - Cyt_B5 (Superfamily link)

    Interpro entry IPR001199 : Cytochrome b5 (Interpro link)

    Interpro description:
    Cytochromes b5 are ubiquitous electron transport proteins found in animals, plants and yeasts. The microsomal and mitochondrial variants are membrane-bound, while those from erythrocytes and other animal tissues are water-soluble.

    The 3D structure of bovine cyt b5 is known, the fold belonging to the alpha+beta class, with 5 strands and 5 short helices forming a framework for supporting a central haem group. The cytochrome b5 domain is similar to that of a number of oxidoreductases, such as plant and fungal nitrate reductases, sulphite oxidase, yeast flavocytochrome b2 (L-lactate dehydrogenase) and plant cyt b5/acyl lipid desaturase fusion protein.

    Proteins where this domain is known:
    PF14_0266    PFI0885w    PFL1555w   


    SSF55874 - ATP_bd_ATPase (Superfamily link)

    Interpro entry IPR003594 : ATP-binding region, ATPase-like (Interpro link)

    Interpro description:

    This domain is found in several ATP-binding proteins, for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    MAL13P1.328    MAL7P1.145    PF07_0029    PF11_0184    PF11_0188    PF14_0316    PF14_0417    PF14_0649    PFL1070c    PFL1915w   


    SSF55904 - Decarbxylse_C (Superfamily link)

    Interpro entry IPR008286 : Orn/Lys/Arg decarboxylase, C-terminal (Interpro link)

    Interpro description:
    Pyridoxal-dependent decarboxylases are bacterial proteins acting on ornithine, lysine, arginine and related substrates. One of the regions of sequence similarity contains a conserved lysine residue, which is the site of attachment of the pyridoxal-phosphate group.

    Proteins where this domain is known:
    PFD0285c   


    SSF55909 - SSF55909 (Superfamily link)

    Proteins where this domain is known:
    PF13_0178   


    SSF55920 - Peptidase_M24_cat_core (Superfamily link)

    Interpro entry IPR000994 : Peptidase M24, structural domain (Interpro link)

    Interpro description:

    Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases.
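
    As a rough illustration of the basic HEXXH pattern described above, the following Python sketch scans a protein sequence for His-Glu-X-X-His. The function name find_hexxh and the toy sequence are hypothetical, invented for this example. The stricter 'abXHEbbHbc' definition is not implemented, because the exact residue sets for 'b' and 'c' are not spelled out here.

```python
import re

# Basic HEXXH motif: His, Glu, any two residues, His.
# The lookahead makes the search report overlapping hits as well.
HEXXH = re.compile(r"(?=(HE[A-Z]{2}H))")


def find_hexxh(sequence):
    """Return a list of (1-based start position, matched motif) tuples."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(seq)]


# Toy sequence made up for illustration; it is not a real P. falciparum protein.
toy_sequence = "MKTAVHELGHALGLSHSNDP"
print(find_hexxh(toy_sequence))  # [(6, 'HELGH')]
```

    A hit from such a scan is only a candidate: as noted above, the HEXXH motif is relatively common, so a match alone does not establish that a protein is a metalloprotease.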

    In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold.

    In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.

    This entry contains proteins that belong to MEROPS peptidase family M24 (clan MG), which share a common structural fold, the "pita-bread" fold. The fold contains both alpha helices and an anti-parallel beta sheet within two structurally similar domains that are thought to be derived from an ancient gene duplication. The active site, where conserved, is located between the two domains. The fold is common to methionine aminopeptidase, aminopeptidase P, prolidase, agropine synthase and creatinase. Though many of these peptidases require a divalent cation, creatinase is not a metal-dependent enzyme.

    The entry also contains proteins that have lost catalytic activity, for example Spt16, which is a component of the FACT complex. The crystal structure of the N-terminal domain of Spt16, determined to 2.1 Å, reveals an aminopeptidase P fold whose enzymatic activity has been lost. This fold binds directly to histones H3-H4 through an interaction with their globular core domains, as well as with their N-terminal tails.

    The FACT complex is a stable heterodimer in Saccharomyces cerevisiae (Baker's yeast) comprising Spt16p and Pob3p. The complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin; it also facilitates RNA Polymerase II transcription elongation through nucleosomes by destabilizing and then reassembling nucleosome structure.

    Proteins where this domain is known:
    MAL8P1.140    PF10_0150    PF14_0261    PF14_0327    PF14_0517    PFE0870w    PFE1360c   


    SSF55931 - SSF55931 (Superfamily link)

    Proteins where this domain is known:
    PFI0925w    PFI1110w   


    SSF55945 - TFIID_C/glycos_N (Superfamily link)

    Interpro entry IPR012294 : Transcription factor TFIID, C-terminal/DNA glycosylase, N-terminal (Interpro link)

    Interpro description:

    Transcription factor TFIID (also known as TATA-binding protein, TBP) is a general factor that plays a central role in the activation of eukaryotic genes transcribed by RNA polymerase II. TFIID binds specifically to the TATA-box promoter element, which lies close to the position of transcription initiation. The C-terminal domain (~180 residues) of eukaryotic TFIID sequences is highly conserved and is involved in TATA-box binding. The most striking feature of the domain is the presence of 2 conserved 77 amino-acid repeats. The symmetrical disposition of these features generates a saddle-shaped structure that straddles the DNA.

    DNA glycosylases are involved in the repair of damaged bases in DNA, acting to cleave the bond between the damaged, modified base and the deoxyribose sugar backbone of the DNA. These DNA repair activities are conserved from bacteria to man. Different DNA glycosylases can have different overall folds, even though many of them work by a common mechanism, involving bending the DNA and clamping on to the damaged base to excise it. This entry is represented by 3-methyladenine DNA glycosylase II (AlkA) from Escherichia coli and human 8-oxoguanine glycosylase, whose N-terminal domains display a beta-alpha-beta(4)-alpha fold similar to that found in the C-terminal domain of TFIID. However, unlike TFIID, which contains a duplication of this fold, these DNA glycosylases carry only a single copy of this fold.

    Proteins where this domain is known:
    PF14_0267    PFE0305w    PFI0835c   


    SSF55957 - SSF55957 (Superfamily link)

    Proteins where this domain is known:
    PF10_0122    PF11_0311   


    SSF55961 - SSF55961 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.256    PF14_0604    PFA0210c    PFC0360w    PFI0540w   


    SSF55973 - S-AdoMet_synt (Superfamily link)

    Interpro entry IPR002133 : S-adenosylmethionine synthetase (Interpro link)

    Interpro description:

    S-adenosylmethionine synthetase (MAT) is the enzyme that catalyzes the formation of S-adenosylmethionine (AdoMet) from methionine and ATP. AdoMet is an important methyl donor for transmethylation and is also the propylamino donor in polyamine biosynthesis.

    In bacteria there is a single isoform of AdoMet synthetase (gene metK); there are two in budding yeast (genes SAM1 and SAM2) and in mammals, while in plants there is generally a multigene family.

    The sequence of AdoMet synthetase is highly conserved throughout isozymes and species. The active sites of both the Escherichia coli and rat liver MAT reside between two subunits, with contributions from side chains of residues from both subunits, resulting in a dimer as the minimal catalytic entity. The side chains that contribute to the ligand binding sites are conserved between the two proteins. In the structures of complexes with the E. coli enzyme, the phosphate groups have the same positions in the (PPi plus Pi) complex and the (ADP plus Pi) complex, and are located at the bottom of a deep cavity with the adenosyl group nearer the entrance.

    Proteins where this domain is known:
    PFI1090w   


    SSF55979 - DNA clamp (Superfamily link)

    Proteins where this domain is known:
    PF13_0328    PF14_0582    PFL1285c   


    SSF56014 - Sulfite reductase hemoprotein (SiRHP), domains 2 and 4 (Superfamily link)

    Proteins where this domain is known:
    PF10_0221   


    SSF56019 - HORMA_DNA_bd (Superfamily link)

    Interpro entry IPR003511 : DNA-binding HORMA (Interpro link)

    Interpro description:
    The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. Hop1 is a meiosis-specific protein, Rev7 is required for DNA damage induced mutagenesis, and MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity.

    Proteins where this domain is known:
    PF10_0227    PF13_0050   


    SSF56024 - SSF56024 (Superfamily link)

    Proteins where this domain is known:
    MAL8P1.58    PFF0465c   


    SSF56037 - B3_4 (Superfamily link)

    Interpro entry IPR005146 : B3/B4 tRNA-binding domain (Interpro link)

    Interpro description:

    This entry represents the B3/B4 domain found in tRNA synthetase beta subunits as well as in some non-tRNA synthetase proteins. This domain has a 3-layer structure, contains a beta-sandwich fold of unusual topology, and includes a putative tRNA-binding structural motif. In Thermus thermophilus, both the catalytic alpha- and the non-catalytic beta-subunits comprise the characteristic fold of the class II active-site domains. The presence of an RNA-binding domain, similar to that of the U1A spliceosomal protein, in the beta-subunit of tRNA synthetase indicates structural relationships among different families of RNA-binding proteins.

    Aminoacyl-tRNA synthetases can catalyse editing reactions to correct errors produced during amino acid activation and tRNA esterification, in order to prevent the attachment of incorrect amino acids to tRNA. The B3/B4 domain of the beta subunit contains an editing site, which lies close to the active site on the alpha subunit. Disruption of this site abolished tRNA editing, a process that is essential for faithful translation of the genetic code.

    Proteins where this domain is known:
    PF11_0051   


    SSF56042 - AIR_synth_C (Superfamily link)

    Interpro entry IPR010918 : (Interpro link)

    Interpro description:

    This entry includes hydrogenase expression/formation protein HypE, which may be involved in the maturation of NiFe hydrogenase; AIR synthase and FGAM synthase, which are involved in de novo purine biosynthesis; and selenide, water dikinase, an enzyme which synthesizes selenophosphate from selenide and ATP.

    Proteins where this domain is known:
    PFI0505c   


    SSF56047 - Ribosomal_S8 (Superfamily link)

    Interpro entry IPR000630 : Ribosomal protein S8 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    Ribosomal protein S8 is one of the proteins from the small ribosomal subunit. In Escherichia coli, S8 is known to bind directly to 16S ribosomal RNA. It belongs to a family of ribosomal proteins which, on the basis of sequence similarities, groups eubacterial, algal and plant chloroplast, cyanelle, archaebacterial and Marchantia polymorpha mitochondrial S8; mammalian and plant S15A; and yeast S22 (S24) ribosomal proteins.

    Proteins where this domain is known:
    MAL7P1.93    PFC0735w   


    SSF56053 - Ribosomal_L6 (Superfamily link)

    Interpro entry IPR000702 : Ribosomal protein L6 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong: the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the process of protein synthesis. In addition to their function in the ribosome, many ribosomal proteins have some function 'outside' the ribosome.

    L6 is a protein from the large (50S) subunit. In Escherichia coli, it is located in the aminoacyl-tRNA binding site of the peptidyltransferase centre, and is known to bind directly to 23S rRNA. It belongs to a family of ribosomal proteins, including L6 from bacteria, cyanelles (structures that perform similar functions to chloroplasts, but have structural and biochemical characteristics of Cyanobacteria) and mitochondria; and L9 from mammals, Drosophila, plants and yeast. L6 comprises 2 almost identical folds, suggesting that it was derived by the duplication of an ancient RNA-binding protein gene. Analysis reveals several sites on the protein surface where interactions with other ribosome components may occur, the N-terminus being involved in protein-protein interactions and the C-terminus containing possible RNA-binding sites.

    Proteins where this domain is known:
    PF13_0129   


    SSF56059 - SSF56059 (Superfamily link)

    Proteins where this domain is known:
    PF10_0094    PF11_0481    PF13_0044    PF14_0282    PF14_0295    PF14_0357    PF14_0664    PFE0605c    PFE0700c   


    SSF56091 - SSF56091 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.22    PF14_0144   


    SSF56104 - SSF56104 (Superfamily link)

    Proteins where this domain is known:
    PF10_0078    PF13_0089    PF14_0123    PFA0515w    PFE0740c   


    SSF56112 - Kinase_like (Superfamily link)

    Interpro entry IPR011009 : (Interpro link)

    Interpro description:

    Protein kinases catalyze the phosphotransfer reaction fundamental to most signalling and regulatory processes in the eukaryotic cell. The catalytic subunit contains a core that is common to both serine/threonine and tyrosine protein kinases. The catalytic domain contains the nucleotide-binding site and the catalytic apparatus in an inter-lobe cleft. It shares functional and structural similarities with the ATP-grasp fold, which is found in enzymes that catalyse the formation of an amide bond, and with PIPK (phosphoinositol phosphate kinase). The three-dimensional fold of the protein kinase catalytic domain is similar to domains found in several other proteins. These include the catalytic domain of actin-fragmin kinase, an atypical protein kinase that regulates the F-actin capping activity in plasmodia; the catalytic domain of phosphoinositide-3-kinase (PI3K), which phosphorylates phosphoinositides and as such is involved in a number of fundamental cellular processes such as apoptosis, proliferation, motility and adhesion; the catalytic domain of the MHCK/EF2 kinase, an atypical protein kinase that includes the TRP (transient receptor potential) calcium-channel kinase involved in the modulation of calcium channels in eukaryotic cells in response to external signals; choline kinase, which catalyses the ATP-dependent phosphorylation of choline during the biosynthesis of phosphatidylcholine; and 3',5'-aminoglycoside phosphotransferase type IIIa, a bacterial enzyme that confers resistance to a range of aminoglycoside antibiotics.

    Proteins where this domain is known:
    MAL13P1.114    MAL13P1.185    MAL13P1.196    MAL13P1.278    MAL13P1.279    MAL13P1.84    MAL7P1.100    MAL7P1.127    MAL7P1.132    MAL7P1.144    MAL7P1.175    MAL7P1.18    MAL7P1.26    MAL7P1.73    MAL7P1.91    MAL8P1.203    MAL8P1.42    PF07_0072    PF08_0044    PF08_0098    PF10_0141    PF10_0160    PF10_0380    PF11_0060    PF11_0079    PF11_0096    PF11_0147    PF11_0156    PF11_0220    PF11_0227    PF11_0239    PF11_0242    PF11_0257    PF11_0377    PF11_0464    PF11_0488    PF11_0510    PF13_0085    PF13_0166    PF13_0211    PF13_0258    PF14_0020    PF14_0143    PF14_0227    PF14_0264    PF14_0294    PF14_0320    PF14_0346    PF14_0392    PF14_0408    PF14_0423    PF14_0431    PF14_0476    PF14_0516    PF14_0715    PF14_0734    PFA0130c    PFA0380w    PFB0150c    PFB0520w    PFB0605w    PFB0665w    PFB0815w    PFC0060c    PFC0105w    PFC0385c    PFC0420w    PFC0485w    PFC0525c    PFC0755c    PFC0945w    PFD0740w    PFD0865c    PFD0965W    PFD0975w    PFD1165w    PFD1175w    PFE0045c    PFE0485w    PFE0765w    PFE1290w    PFF0260w    PFF0520w    PFF0750w    PFF1145c    PFF1370w    PFI0095c    PFI0100c    PFI0105c    PFI0110c    PFI0115c    PFI0120c    PFI0125c    PFI1275w    PFI1280c    PFI1290w    PFI1415w    PFI1685w    PFL0040c    PFL0080c    PFL1370w    PFL1490w    PFL1885c    PFL2250c    PFL2280w   


    SSF56204 - HECT (Superfamily link)

    Interpro entry IPR000569 : HECT (Interpro link)

    Interpro description:

    The name HECT comes from 'Homologous to the E6-AP Carboxyl Terminus'. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thioester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multisubunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated Human papillomavirus type 16 and Human papillomavirus type 18. The E6/E6-AP complex binds to and targets the p53 tumour-suppressor protein for ubiquitin-mediated proteolysis.

    Proteins where this domain is known:
    MAL7P1.19    MAL8P1.23    PF11_0201    PFF1365c   


    SSF56214 - 4-PPT_transf (Superfamily link)

    Interpro entry IPR008278 : 4'-phosphopantetheinyl transferase (Interpro link)

    Interpro description:

    These proteins transfer the 4'-phosphopantetheine (4'-PP) moiety from coenzyme A (CoA) to the invariant serine of the pp-binding domain. This post-translational modification renders holo-ACP capable of acyl group activation via thioesterification of the cysteamine thiol of 4'-PP. This superfamily consists of two subtypes: the ACPS type, such as ACPS_ECOLI, and the Sfp type, such as SFP_BACSU. The structure of the Sfp type is known and shows that the active site accommodates a magnesium ion. The most highly conserved regions of the alignment are involved in binding the magnesium ion.

    Proteins where this domain is known:
    PFD0980w   


    SSF56219 - Exo_endo_phos (Superfamily link)

    Interpro entry IPR005135 : (Interpro link)

    Interpro description:

    This domain is found in a large number of proteins, including magnesium-dependent endonucleases and phosphatases involved in intracellular signalling. Proteins containing this domain include AP endonucleases, DNase I proteins, synaptojanin (an inositol-1,4,5-trisphosphate phosphatase) and sphingomyelinase.

    Proteins where this domain is known:
    PF07_0024    PF11_0122    PF13_0336    PF14_0285    PFA0350w    PFC0250c    PFC0850c    PFE0980c    PFL1870c   


    SSF56235 - SSF56235 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.270    MAL8P1.128    MAL8P1.142    PF07_0112    PF10_0111    PF10_0245    PF13_0156    PF13_0282    PF14_0334    PF14_0676    PF14_0716    PFA0400c    PFC0395w    PFC0745c    PFE0915c    PFF0420c    PFI1545c    PFL1465c   


    SSF56276 - S-AdenosylMet_decarbase_core (Superfamily link)

    Interpro entry IPR016067 : S-adenosylmethionine decarboxylase, core (Interpro link)

    Interpro description:

    S-adenosylmethionine decarboxylase (AdoMetDC) catalyzes the removal of the carboxylate group of S-adenosylmethionine to form S-adenosyl-5'-3-methylpropylamine which then acts as the n-propylamine group donor in the synthesis of the polyamines spermidine and spermine from putrescine.

    The catalytic mechanism of AdoMetDC involves a covalently-bound pyruvoyl group. This group is post-translationally generated by a self-catalyzed intramolecular proteolytic cleavage reaction between a glutamate and a serine. This cleavage generates two chains, beta (N-terminal) and alpha (C-terminal). The N-terminal serine residue of the alpha chain is then converted by nonhydrolytic serinolysis into a pyruvoyl group.

    Proteins where this domain is known:
    PF10_0322   


    SSF56281 - SSF56281 (Superfamily link)

    Proteins where this domain is known:
    PF07_0100    PF11_0452    PF14_0161    PF14_0364    PF14_0620    PF14_0711    PFC0825c    PFD0311w    PFL0285w    PFL1810w   


    SSF56300 - SSF56300 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.274    PF08_0129    PF10_0177    PF13_0222    PF14_0036    PF14_0064    PF14_0142    PF14_0224    PF14_0614    PF14_0630    PF14_0660    PFA0390w    PFC0595c    PFI0605c    PFI0880c    PFI1245c    PFI1360c    PFL0300c   


    SSF56322 - TRPE_1_chor_bd (Superfamily link)

    Interpro entry IPR005801 : Anthranilate synthase component I and chorismate binding protein (Interpro link)

    Interpro description:
    This entry represents the catalytic regions of the chorismate binding enzymes anthranilate synthase, isochorismate synthase, aminodeoxychorismate synthase and para-aminobenzoate synthase. Anthranilate synthase catalyses the reaction:
     chorismate + L-glutamine = anthranilate + pyruvate + L-glutamate 
    The enzyme is a tetramer comprising 2 I and 2 II components: this entry is restricted to component I, which catalyses the formation of anthranilate using ammonia rather than glutamine, while component II provides glutamine amidotransferase activity.

    Proteins where this domain is known:
    PFI1100w   


    SSF56327 - Lactate_DH/Glyco_hydro_4_C (Superfamily link)

    Interpro entry IPR015955 : Lactate dehydrogenase/glycoside hydrolase, family 4, C-terminal (Interpro link)

    Interpro description:

    This entry represents a structural motif found at the C-terminal of lactate dehydrogenase and malate dehydrogenases, as well as at the C-terminal of family 4 glycoside hydrolases. These domains have an unusual fold consisting of segregated alpha-helical and beta-sheet regions, although they contain predominantly anti-parallel beta-sheets.

    L-lactate dehydrogenases are metabolic enzymes that catalyse the conversion of L-lactate to pyruvate, the last step in anaerobic glycolysis. L-lactate dehydrogenase is also found as a lens crystallin in bird and crocodile eyes. Malate dehydrogenases catalyse the interconversion of malate and oxaloacetate. The enzyme participates in the citric acid cycle.

    O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A classification system for glycosyl hydrolases, based on sequence similarity, has led to the definition of 85 different families. This classification is available on the CAZy (CArbohydrate-Active EnZymes) web site. Because the fold of proteins is better conserved than their sequences, some of the families can be grouped in 'clans'. Glycoside hydrolase family 4 comprises enzymes with several known activities: 6-phospho-beta-glucosidase, 6-phospho-alpha-glucosidase and alpha-galactosidase.

    Proteins where this domain is known:
    PF13_0141    PF13_0144    PFF0895w   


    SSF56349 - DNA_brk_join_enz (Superfamily link)

    Interpro entry IPR011010 : DNA breaking-rejoining enzyme, catalytic core (Interpro link)

    Interpro description:

    Phage integrases are enzymes that mediate unidirectional site-specific recombination between two DNA recognition sequences, the phage attachment site, attP, and the bacterial attachment site, attB. Integrases may be grouped into two major families, the tyrosine recombinases and the serine recombinases, based on their mode of catalysis. Tyrosine family integrases, such as Bacteriophage lambda integrase, utilise a catalytic tyrosine to mediate strand cleavage, tend to recognize longer attP sequences, and require other proteins encoded by the phage or the host bacteria.

    The 356 amino acid lambda integrase consists of two domains: an N-terminal domain that includes residues 1-64 and is responsible for binding the arm-type sites of attP, and a C-terminal domain (CTD) that binds the lower affinity core-type sites and contains the catalytic site. The CTD can be further divided into the core-type binding domain (residues 65-169) and the catalytic core domain (170-356), the latter representing this entry. The catalytic core adopts an alpha3-beta3-alpha4 fold, where one side of the beta sheet is exposed.
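
    The residue ranges quoted above can be written down as a small lookup table. The Python sketch below is only an illustration: the names LAMBDA_INT_DOMAINS and domain_of are hypothetical, and the boundaries are taken directly from the description, not from any annotation file.

```python
# Domain boundaries of lambda integrase as stated in the description
# (1-based, inclusive residue numbers).
LAMBDA_INT_DOMAINS = [
    (1, 64, "N-terminal domain (arm-type site binding)"),
    (65, 169, "core-type binding domain"),
    (170, 356, "catalytic core domain"),
]


def domain_of(residue):
    """Return the name of the domain that contains the given residue number."""
    for start, end, name in LAMBDA_INT_DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the 356-residue protein")


print(domain_of(64))   # N-terminal domain (arm-type site binding)
print(domain_of(200))  # catalytic core domain
```

    Only the last range, residues 170-356, corresponds to the catalytic core that this entry represents.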

    The recombinases Cre from phage P1, XerD from Escherichia coli and Flp from yeast are members of the tyrosine recombinase family, and have a two-domain motif resembling that of lambda integrase, as well as sharing a conserved binding mechanism. The structural fold of their catalytic core domains resembles that of lambda integrase.

    The catalytic core of the eukaryotic DNA topoisomerase I shares significant structural similarity with the bacteriophage family of DNA integrases. Topoisomerases I promote the relaxation of DNA superhelical tension by introducing a transient single-stranded break in duplex DNA and are vital for the processes of replication, transcription, and recombination.

    Proteins where this domain is known:
    MAL13P1.42    PFE0520c   


    SSF56399 - SSF56399 (Superfamily link)

    Proteins where this domain is known:
    PFE0895c   


    SSF56420 - Fmet_deformylase (Superfamily link)

    Interpro entry IPR000181 : Formylmethionine deformylase (Interpro link)

    Interpro description:

    Peptide deformylase (PDF) is an essential metalloenzyme required for the removal of the formyl group at the N-terminus of nascent polypeptide chains in eubacteria. The enzyme acts as a monomer and binds a single zinc ion, catalysing the reaction:

     Formyl-L-methionyl peptide + H2O = formate + methionyl peptide
    Catalytic efficiency strongly depends on the identity of the bound metal.

    The structure of these enzymes is known. PDF, a member of the zinc metalloprotease family, comprises an active core domain of 147 residues and a C-terminal tail of 21 residues. The 3D fold of the catalytic core has been determined by X-ray crystallography and NMR. Overall, the structure contains a series of anti-parallel beta-strands that surround two perpendicular alpha-helices. The C-terminal helix contains the characteristic HEXXH motif of metalloenzymes, which is crucial for activity. The helical arrangement, and the way the histidine residues bind the zinc ion, is reminiscent of other metalloproteases, such as thermolysin or metzincins. However, the arrangement of secondary and tertiary structures of PDF, and the positioning of its third zinc ligand (a cysteine residue), are quite different. These discrepancies, together with notable biochemical differences, suggest that PDF constitutes a new class of zinc-metalloproteases.

    Proteins where this domain is known:
    PFI0380c   


    SSF56425 - SSF56425 (Superfamily link)

    Proteins where this domain is known:
    PF10_0334   


    SSF56487 - Srcr_receptor (Superfamily link)

    Interpro entry IPR017448 : Speract/scavenger receptor related (Interpro link)

    Interpro description:

    The egg peptide speract receptor is a transmembrane glycoprotein. Other members of this family include the macrophage scavenger receptor type I (a membrane glycoprotein implicated in the pathologic deposition of cholesterol in arterial walls during atherogenesis), an enteropeptidase and T-cell surface glycoprotein CD5 (which may act as a receptor in regulating T-cell proliferation).

    Proteins where this domain is known:
    PF14_0067   


    SSF56496 - Fibrinogen_a/b/g_C (Superfamily link)

    Interpro entry IPR002181 : Fibrinogen, alpha/beta/gamma chain, C-terminal globular (Interpro link)

    Interpro description:

    Fibrinogen plays key roles in both blood clotting and platelet aggregation. During blood clot formation, the conversion of soluble fibrinogen to insoluble fibrin is triggered by thrombin, resulting in the polymerisation of fibrin, which forms a soft clot; this is then converted to a hard clot by factor XIIIA, which cross-links fibrin molecules. Platelet aggregation involves the binding of the platelet protein receptor integrin alpha(IIb)-beta(3) to the C-terminal D domain of fibrinogen. In addition to platelet aggregation, platelet-fibrinogen interaction mediates both adhesion and fibrin clot retraction.

    Fibrinogen occurs as a dimer, where each monomer is composed of three non-identical chains, alpha, beta and gamma, linked together by several disulphide bonds. The N-terminals of all six chains come together to form the centre of the molecule (E domain), from which the monomers extend in opposite directions as coiled coils, followed by C-terminal globular domains (D domains). Therefore, the domain composition is: D-coil-E-coil-D. At each end, the C-terminal of the alpha chain extends beyond the D domain as a protuberance that is important for cross-linking the molecule.

    During clot formation, the N-terminal fragments of the alpha and beta chains (within the E domain) in fibrinogen are cleaved by thrombin, releasing fibrinopeptides A and B, respectively, and producing fibrin. This cleavage results in the exposure of four binding sites on the E domain, each of which can bind to a D domain from different fibrin molecules. The binding of fibrin molecules produces a polymer consisting of a lattice network of fibrins that form a long, branching, flexible fibre. Fibrin fibres interact with platelets to increase the size of the clot, as well as with several different proteins and cells, thereby promoting the inflammatory response and concentrating the cells required for wound repair at the site of damage.

    This entry represents the C-terminal globular D domain of the alpha, beta and gamma chains. These domains are related to domains in other proteins: in the Parastichopus parvimensis (Sea cucumber) fibrinogen-like FreP-A and FreP-B proteins; in the C-terminus of the Drosophila scabrous protein that is involved in the regulation of neurogenesis, possibly through the inhibition of R8 cell differentiation; and in ficolin proteins, which display lectin activity towards N-acetylglucosamine through their fibrinogen-like domains.

    More information about these proteins can be found at Protein of the Month: Fibrinogen.

    Proteins where this domain is known:
    PF14_0532    PF14_0723   


    SSF56553 - RNAP_insert (Superfamily link)

    Interpro entry IPR011262 : DNA-directed RNA polymerase, insert (Interpro link)

    Interpro description:

    DNA-directed RNA polymerases (also known as DNA-dependent RNA polymerases) are responsible for the polymerisation of ribonucleotides into a sequence complementary to the template DNA. In eukaryotes, there are three different forms of DNA-directed RNA polymerases transcribing different sets of genes. Most RNA polymerases are multimeric enzymes and are composed of a variable number of subunits. The core RNA polymerase complex consists of five subunits (two alpha, one beta, one beta-prime and one omega) and is sufficient for transcription elongation and termination but is unable to initiate transcription. Transcription initiation from promoter elements requires a sixth, dissociable subunit called a sigma factor, which reversibly associates with the core RNA polymerase complex to form a holoenzyme. The core RNA polymerase complex forms a "crab claw"-like structure with an internal channel running along the full length. The key functional sites of the enzyme, as defined by mutational and cross-linking analysis, are located on the inner wall of this channel.

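    RNA synthesis follows after the attachment of RNA polymerase to a specific site, the promoter, on the template DNA strand. The RNA synthesis process continues until a termination sequence is reached. The RNA product, which is synthesised in the 5' to 3' direction, is known as the primary transcript. Eukaryotic nuclei contain three distinct types of RNA polymerases that differ in the RNA they synthesise: RNA polymerase I synthesises the precursors of most ribosomal RNAs, RNA polymerase II synthesises mRNA precursors, and RNA polymerase III synthesises the precursors of 5S ribosomal RNA, the tRNAs and a variety of other small RNAs.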

    Eukaryotic cells are also known to contain separate mitochondrial and chloroplast RNA polymerases. Eukaryotic RNA polymerases, whose molecular masses range from 500 to 700 kDa, contain two non-identical large (more than 100 kDa) subunits and an array of up to 12 different small (less than 50 kDa) subunits.

    RNA polymerase (RNAP) II, which is responsible for all mRNA synthesis in eukaryotes, consists of 12 subunits. Subunits Rpb3 and Rpb11 form a heterodimer that is functionally analogous to the archaeal RNAP D/L heterodimer, and to the prokaryotic RNAP alpha (RpoA) subunit homodimer. In each case, they play a key role in RNAP assembly by forming a platform on which the catalytic subunits (eukaryotic Rpb1/Rpb2, and prokaryotic beta/beta') can interact.

    The dimerisation domains differ between the different subunit families. In eukaryotic Rpb3, archaeal D and bacterial RpoA subunits, the dimerisation domain is comprised of a central insert domain, which interrupts an Rpb11-like domain, dividing it into two halves. In eukaryotic Rpb11 and archaeal L subunits, the insert domain is lacking, leaving the Rpb11-like domain intact and contiguous.

    Proteins where this domain is known:
    PF11_0445    PF13_0040    PF14_0695    PFI1130c   


    SSF56601 - PBP_transp_fold (Superfamily link)

    Interpro entry IPR012338 : (Interpro link)

    Interpro description:

    This entry represents a beta-lactamase structural motif, which contains a cluster of alpha-helices and an alpha/beta sandwich. In addition to beta-lactamases, this domain is also found in D-ala carboxypeptidase/transpeptidase, esterase (EstB), the penicillin receptor BlaR (C-terminal domain), D-aminopeptidase (N-terminal domain), penicillin-binding proteins (e.g. PBP2x, PBP5), and in glutaminase (GlnA). Beta-lactamases are the most common bacterial resistance mechanism against beta-lactam antibiotics. Beta-lactamases appear to have evolved from DD-transpeptidases, which are penicillin-binding proteins involved in cell wall biosynthesis, and as such are one of the main targets of beta-lactam antibiotics.

    Proteins where this domain is known:
    PF14_0143   


    SSF56672 - SSF56672 (Superfamily link)

    Proteins where this domain is known:
    PF10_0165    PF10_0362    PF11_0264    PF14_0112    PFD0590c    PFF1225c    PFF1470c    PFI0510c   


    SSF56712 - Topo_IA_core (Superfamily link)

    Interpro entry IPR000380 : DNA topoisomerase, type IA, core (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.
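
    The two-class division described above can be summarised as a lookup table. This is a minimal Python sketch that encodes only what the description states; the names TOPOISOMERASE_CLASS, STRANDS_BROKEN and strands_broken are hypothetical and used purely for illustration.

```python
# Class assignment exactly as listed in the description: topoisomerases I, III
# and V are type I; topoisomerases II, IV and VI are type II.
# All names below are illustrative and not taken from any existing library.
TOPOISOMERASE_CLASS = {
    "I": "type I",
    "III": "type I",
    "V": "type I",
    "II": "type II",
    "IV": "type II",
    "VI": "type II",
}

# Number of DNA strands transiently broken by each class.
STRANDS_BROKEN = {"type I": 1, "type II": 2}


def strands_broken(numeral: str) -> int:
    """Return how many strands the named topoisomerase breaks (1 or 2)."""
    return STRANDS_BROKEN[TOPOISOMERASE_CLASS[numeral]]
```

    In this table the odd-numbered enzymes fall into type I and the even-numbered enzymes into type II, so strands_broken("III") gives 1 and strands_broken("VI") gives 2.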

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry describes the core region of type IA topoisomerases, which are highly conserved enzymes that are structurally distinct from type IB enzymes. The structures of both topoisomerases I and III have been elucidated, and consist of four domains that together form a toroidal molecule with a central hole that is large enough to accommodate single- and double-stranded DNA. It is believed that the domains transiently separate from one another to allow the entrance and exit of DNA strands.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF13_0251   


    SSF56719 - Topo_IIA_cen (Superfamily link)

    Interpro entry IPR013760 : DNA topoisomerase, type IIA, central (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type II topoisomerases are ATP-dependent enzymes, and can be subdivided according to their structure and reaction mechanisms: type IIA (topoisomerase II or gyrase, and topoisomerase IV) and type IIB (topoisomerase VI). These enzymes are responsible for relaxing supercoiled DNA as well as for introducing both negative and positive supercoils.

    Type IIA topoisomerases together manage chromosome integrity and topology in cells. Topoisomerase II (called gyrase in bacteria) primarily introduces negative supercoils into DNA. In bacteria, topoisomerase II consists of two polypeptide subunits, gyrA and gyrB, which form a heterotetramer: (BA)2. In most eukaryotes, topoisomerase II consists of a single polypeptide, where the N- and C-terminal regions correspond to gyrB and gyrA, respectively; this topoisomerase II forms a homodimer that is equivalent to the bacterial heterotetramer. There are four functional domains in topoisomerase II: domain 1 (N-terminal of gyrB) is an ATPase, domain 2 (C-terminal of gyrB) is responsible for subunit interactions (differs between eukaryotic and bacterial enzymes), domain 3 (N-terminal of gyrA) is responsible for the breaking-rejoining function through its capacity to form protein-DNA bridges, and domain 4 (C-terminal of gyrA) is able to non-specifically bind DNA.

    Topoisomerase IV primarily decatenates DNA and relaxes positive supercoils, which is important in bacteria, where the circular chromosome becomes catenated, or linked, during replication. Topoisomerase IV consists of two polypeptide subunits, parE and parC, where parC is homologous to gyrA and parE is homologous to gyrB.

    This entry represents the C-terminal of subunit B (gyrB and parE) and the N-terminal of subunit A (gyrA and parC) of bacterial gyrase and topoisomerase IV, and the equivalent central region in eukaryotic topoisomerase II composed of a single polypeptide.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF14_0316    PFL1120c    PFL1915w   


    SSF56726 - Spo11/TopoVI_A (Superfamily link)

    Interpro entry IPR002815 : Spo11/DNA topoisomerase VI, subunit A (Interpro link)

    Interpro description:

    This entry represents Spo11, a meiotic recombination protein found in eukaryotes, and subunit A of topoisomerase VI, a type IIB topoisomerase found predominantly in archaea. These two types of proteins share structural homology.

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. They can be divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA. Topoisomerase VI is a type IIB enzyme that assembles as a heterotetramer, consisting of two A subunits required for DNA cleavage and two B subunits required for ATP hydrolysis. The B subunit is structurally similar to the ATPase domain of type IIA topoisomerases, but the A subunit is distinct, and instead shares homology with the Spo11 protein.

    Spo11 is a meiosis-specific protein that is responsible for the initiation of recombination through the formation of DNA double-strand breaks by a type II DNA topoisomerase-like activity. Spo11 acts in conjunction with several other proteins, including Rec102 in yeast, to bring about meiotic recombination.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PF10_0412    PFL0825c   


    SSF56731 - SSF56731 (Superfamily link)

    Proteins where this domain is known:
    PF14_0112   


    SSF56741 - TopoI_DNA_bd_euk (Superfamily link)

    Interpro entry IPR008336 : DNA topoisomerase I, DNA binding, eukaryotic-type (Interpro link)

    Interpro description:

    DNA topoisomerases regulate the number of topological links between two DNA strands (i.e. change the number of superhelical turns) by catalysing transient single- or double-strand breaks, crossing the strands through one another, then resealing the breaks. These enzymes have several functions: to remove DNA supercoils during transcription and DNA replication; for strand breakage during recombination; for chromosome condensation; and to disentangle intertwined DNA during mitosis. DNA topoisomerases are divided into two classes: type I enzymes (topoisomerases I, III and V) break single-strand DNA, and type II enzymes (topoisomerases II, IV and VI) break double-strand DNA.

    Type I topoisomerases are ATP-independent enzymes (except for reverse gyrase), and can be subdivided according to their structure and reaction mechanisms: type IA (bacterial and archaeal topoisomerase I, topoisomerase III and reverse gyrase) and type IB (eukaryotic topoisomerase I and topoisomerase V). These enzymes are primarily responsible for relaxing positively and/or negatively supercoiled DNA, except for reverse gyrase, which can introduce positive supercoils into DNA.

    This entry represents the N-terminal DNA-binding domain found in eukaryotic topoisomerase I, which is a type IB enzyme. To cleave the DNA backbone, these enzymes must make a transient phosphotyrosine bond. The N-terminal domain of human topoisomerase I is thought to coordinate the restriction of free strand rotation during the topoisomerisation step of catalysis. A conserved tryptophan residue may be important for the DNA-interaction ability of the N-terminal domain. Human topoisomerase I has been shown to be inhibited by camptothecin (CPT), a plant alkaloid with antitumour activity. A binding mode for the anticancer drug camptothecin has been proposed on the basis of chemical and biochemical information combined with the three-dimensional structures of topoisomerase I-DNA complexes.

    More information about this protein can be found at Protein of the Month: DNA Topoisomerase.

    Proteins where this domain is known:
    PFE0520c   


    SSF56747 - Prim-pol domain (Superfamily link)

    Proteins where this domain is known:
    PF14_0366   


    SSF56752 - Aminotrans_IV (Superfamily link)

    Interpro entry IPR001544 : Aminotransferase, class IV (Interpro link)

    Interpro description:

    Aminotransferases share certain mechanistic features with other pyridoxal-phosphate dependent enzymes, such as the covalent binding of the pyridoxal-phosphate group to a lysine residue. On the basis of sequence similarity, these various enzymes can be grouped into subfamilies.

    One of these, called class-IV, currently consists of proteins of about 270 to 415 amino-acid residues that share a few regions of sequence similarity. Surprisingly, the best conserved region does not include the lysine residue to which the pyridoxal-phosphate group is known to be attached, in ilvE, but is located some 40 residues on the C-terminal side of the pyridoxal-phosphate-lysine. The D-amino acid transferases (D-AAT), which are among the members of this entry, are required by bacteria to catalyse the synthesis of D-glutamic acid and D-alanine, which are essential constituents of the bacterial cell wall and are the building blocks for other D-amino acids. Despite the difference in the structure of the substrates, D-AATs and L-AATs have strong similarity.

    Proteins where this domain is known:
    PF14_0557   


    SSF56784 - HAD-like (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.275    MAL13P1.301    PF07_0059    PF07_0110    PF07_0115    PF10_0124    PF10_0169    PF10_0325    PF11_0190    PF11_0395    PF13_0334    PF14_0654    PFA0310c    PFC0150w    PFC0840w    PFE0195w    PFE0795c    PFE0805w    PFI0240c    PFL0305c    PFL0590c    PFL0950c    PFL1125w    PFL1260w    PFL1270w   


    SSF56801 - SSF56801 (Superfamily link)

    Proteins where this domain is known:
    MAL13P1.485    PF07_0129    PF10_0090    PF14_0751    PF14_0761    PFB0685c    PFB0695c    PFC0050c    PFD0085c    PFE1250w    PFF0945c    PFF1350c    PFL0035c    PFL1880w    PFL2570w   


    SSF56808 - Ribosomal_L1 (Superfamily link)

    Interpro entry IPR002143 : Ribosomal protein L1 (Interpro link)

    Interpro description:

    Ribosomes are the particles that catalyse mRNA-directed protein synthesis in all organisms. The codons of the mRNA are exposed on the ribosome to allow tRNA binding. This leads to the incorporation of amino acids into the growing polypeptide chain in accordance with the genetic information. Incoming amino acid monomers enter the ribosomal A site in the form of aminoacyl-tRNAs complexed with elongation factor Tu (EF-Tu) and GTP. The growing polypeptide chain, situated in the P site as peptidyl-tRNA, is then transferred to aminoacyl-tRNA and the new peptidyl-tRNA, extended by one residue, is translocated to the P site with the aid of elongation factor G (EF-G) and GTP as the deacylated tRNA is released from the ribosome through one or more exit sites. About 2/3 of the mass of the ribosome consists of RNA and 1/3 of protein. The proteins are named in accordance with the subunit of the ribosome to which they belong - the small (S1 to S31) and the large (L1 to L44). Usually they decorate the rRNA cores of the subunits.

    Many ribosomal proteins, particularly those of the large subunit, are composed of a globular, surface-exposed domain with long finger-like projections that extend into the rRNA core to stabilise its structure. Most of the proteins interact with multiple RNA elements, often from different domains. In the large subunit, about 1/3 of the 23S rRNA nucleotides are at least in van der Waals contact with protein, and L22 interacts with all six domains of the 23S rRNA. Proteins S4 and S7, which initiate assembly of the 16S rRNA, are located at junctions of five and four RNA helices, respectively. In this way proteins serve to organise and stabilise the rRNA tertiary structure. While the crucial activities of decoding and peptide transfer are RNA based, proteins play an active role in functions that may have evolved to streamline the pr